Preface


This book has evolved from an introductory course in mathematics given to 
engineering students at the University of Newcastle-upon-Tyne during the 
last few years. It represents the author’s attempt to offer the engineering 
student, and the science student who is not majoring in a mathematical 
aspect of his subject, a broad and modern account of those parts of mathe- 
matics that are finding increasingly important application in the everyday 
development of his subject. 

Although this book does not seek to teach any of the many physical
disciplines to which its results and methods may be applied, it nevertheless 
makes free use of them for purposes of illustration whenever this seems to be 
helpful. Every effort has been made to integrate the various chapters into a 
description of mathematics as a single subject, and not as a collection of 
seemingly unrelated topics. Thus, for example, matrices are not only intro- 
duced in an algebraic context, but they are also related in other chapters to 
change of variables in partial differentiation and to the study of simultaneous 
differential equations. 

Modern notation and terminology have been used freely but, it is hoped, 
never to the point of becoming pedantic when a simple word or phrase seems 
more natural. Of necessity, much of the material in this book is standard, 
though the emphasis and manner of introduction and presentation frequently 
differs from that found elsewhere. This is deliberate, and is a reflection of 
the changing importance of mathematical topics in engineering and science 
to-day. 

In many introductory mathematics texts for engineering and science 
students no serious attempt is made to offer reasonable proofs of main
results and, instead, attention is largely confined to their manipulation. 
Important though this aspect undoubtedly is, it is the author’s belief that 
knowledge of the proof of a result is often as essential as its subsequent 
application, and that the modern student needs and merits both. With this 
thought in mind proofs of results have always been included, and, though 
they have been kept as simple as possible, no attempt has been made to 
conceal difficulty where it exists. Only very occasionally, when the proof of 
a result is lengthy, and its details are largely irrelevant to the subsequent 
development of the argument, has the treatment been shortened to a summary 
of the logical steps involved. Even then the interested reader can often find 
more relevant information amongst the specially selected problems at the 
end of each chapter. 

As implied by the previous remark, the many problems not only comprise 
those offering manipulative exercise, but also those shedding further light 
on topics only touched upon in the main text. No serious student can progress
in his knowledge of this subject without a proper investment of time and effort 
spent working at a selection of these problems. The main text is provided with 
numerous illustrative examples designed to be helpful both when working 
through the text and when attempting the classified problems. It is hoped 
that their inclusion also makes the book suitable for private study. 

The wide range of material covered in this book represents rather more 
than would normally be contained in an introductory course of lectures. 
Whilst allowing for changing approaches in teaching, this fact also permits 
some flexibility in use of the material and at the same time offers further 
relevant reading to the ambitious student. In addition to the author’s own 
experience of the application of mathematics in engineering and science, the 
choice and style of presentation of material has been influenced by two 
recently published documents: the Council of Engineering Institutions 
syllabuses in mathematics in Britain and the CUPM recommendations made 
by the Mathematical Association of America. It is the author’s hope that 
this book complies fully with the former document and with the spirit of the 
latter insofar as its recommendations are applicable to engineering and 
science students. 

The material has all been class-tested and, as a result, has undergone 
considerable modification from its first appearance as lecture notes to the 
form of presentation adopted here. It is a pleasure to acknowledge the help 
of the publishers who have given me continued encouragement and every 
possible form of assistance throughout the entire period of preparation of
the book.
A. J. 


As a direct result of requests by users of the first printing of this book it 
was decided that a short chapter on Fourier Series should be added. The 
present revised imprint contains this new material and also incorporates a 
number of small corrections drawn to the author’s attention by various kind
readers.
A. J. 
Introduction to sets and numbers


1.1 Sets and algebra


In applications of mathematics to engineering and science, we often use the 
properties of real numbers. Many of these properties are intuitively obvious, 
but others are more subtle and depend for their proper use on a simple 
understanding of the mathematical basis of the so-called real number system. 
This chapter describes the elements of the real number system in a straight- 
forward manner for subsequent use throughout the book. 

The reader will certainly know how to work with finite combinations of 
numbers, but what is less certain is whether he understands how to interpret
and use limiting processes. For example, what is the meaning and what, if
any, is the value to be associated with the limit


$$\lim_{n \to \infty} \left[ 1 + \frac{1}{n} \right]^{n},$$


which is to be interpreted as the value approached by the expression in square 
brackets as n increases without bound? 
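
A purely numerical experiment gives some feeling for this question before any theory is developed. The short Python sketch below is an added illustration and not part of the original course; the function name compound is an arbitrary choice. It evaluates the bracketed expression raised to the power n for increasing n and prints the standard library constant math.e alongside, which the values appear to approach. Such a table suggests, but does not prove, that the limit exists.

```python
import math


def compound(n: int) -> float:
    """Return (1 + 1/n)**n for a positive integer n."""
    return (1.0 + 1.0 / n) ** n


# Tabulate the expression for steadily larger n.
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: {compound(n):.6f}")

# The constant e from the standard library, shown for comparison only.
print(f"math.e     : {math.e:.6f}")
```

The printed values increase with n and settle ever more slowly, which is typical of a limiting process of this kind.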

It was questions such as these and, indeed, far simpler ones that first led 
to the study of real numbers. Many properties of numbers, nowadays accepted 
by all as self-evident, were once regarded as questionable. This is still clearly
apparent from much of the notation that is in current use. 

Thus, for example, the fact that $\sqrt{2}$ cannot be expressed as the ratio of
two integers led to its being termed an irrational number. Even more extreme
is the term imaginary number that is given to $\sqrt{-1}$. Although, as we shall
see later, this number does not belong to the real number system and so 
merits special consideration, it is however no less real than the integer 2. 

Experience suggests that in any systematic development of the properties 
of the real number system, the operations of addition and multiplication must 
play a fundamental role. These conjectures are of course true, but underlying 
the idea of real numbers and their algebraic manipulation are the even more 
fundamental concepts of sets and their associated algebra. Because these 
notions are sometimes unfamiliar, we shall start by considering some simple 
but important ideas concerning sets. 

We must first define the term set for which the alternative terms aggregate, 
class, and collection are also often used. Our approach will be direct and 
pragmatic and we shall agree that a set comprises a collection of objects or 
elements, each of which is chosen for membership of the set because it 
possesses some required property. Membership of the set is determined
entirely by this property; an object only belongs to the set if it possesses the
required property, otherwise it does not belong to the set. The properties of 
membership and non-membership of a set are mutually exclusive. 

An important numerical set which we shall often have occasion to use is
the set N of natural numbers 1, 2, 3, . . ., used in counting. In future the 
symbol N will always be used to signify this natural set of positive integers. 
Notice that there can be no greatest member m of this set, since however 
large m may be, m + 1 is larger and yet is also a member of the set N. 
Accordingly, when we use a number m that is allowed to increase without 
restriction, it will be convenient to imply this by saying that ‘m tends to
infinity’, and to write the statement in the form $m \to \infty$. Notice that infinity
is not a number in the usual sense, but just the outcome of the mathematical
process of allowing m to increase without bound. It is always necessary to
relate the symbol $\infty$ to some mathematical expression, since by itself it has
little or no meaning. 

N is only one type of set however, and from the wording of our definition 
it is apparent that the elements of a set need not be numerical. Thus in statistics 
one is concerned with sets of events which may or may not be numerical, 
whereas in the analysis of logical operations one is concerned with sets of 
decisions. The notation and simple algebra we now develop are applicable to 
all sets and, hence, to any situations such as those just enumerated which are 
capable of description in terms of sets. 

To simplify the manipulation of these ideas we must introduce a notation 
for elements of a set, for sets themselves, and for the membership of an 
element to a set. It is customary to denote general elements of sets by lower
case letters a, b, . . ., x, . . ., and sets themselves by capital letters A, B,
. . ., S, . . . If a is a member of set A we shall write


$$a \in A.$$


This is usually read ‘a is an element of A’. Conversely, if a is not an element of 
A we shall write 


$$a \notin A.$$


In this notation we have $3 \in N$, but $\pi \notin N$, where $\pi = 3.1415\ldots$, and N is
the set of natural numbers. 

If a set only contains a small number of elements it is often simplest to 
define it by enumerating the elements. Hence, for a set S comprising the four 
integer elements 3, 4, 5, and 6 we would write S = {3, 4, 5, 6}. This set is a 
finite set in the sense that it comprises a finite number of elements. Con- 
versely, the set N of natural numbers is an infinite set since it contains an 
infinite number of elements. 

Often it is useful to have a notation which indicates the membership 
criterion that is to be used for the set. Thus, if we were interested in the set B 
of positive integers n whose squares lie between the positive numbers m and
2m, we would write


$$B = \{\, n \mid n \in N,\ m < n^{2} < 2m \,\}.$$


Here we have used the convention that the symbol n to the left of the vertical
rule signifies a general element of the set in question, whilst the expressions to
the right of the rule express the membership criteria for the set. Here, of
course, the symbol < when used in conjunction with numbers a and b in
the form a < b is to be read ‘a less than b’.
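
The membership criterion translates almost word for word into a program. The following Python sketch is an added illustration and not part of the original text; the function name squares_between is hypothetical, and m is assumed to be a positive number. It builds the finite set B by testing successive natural numbers n until the square of n reaches 2m.

```python
def squares_between(m: float) -> set:
    """Return B = {n | n in N, m < n**2 < 2*m} as a Python set."""
    members = set()
    n = 1
    # Once n**2 reaches 2*m no larger n can satisfy the criterion.
    while n * n < 2 * m:
        if n * n > m:
            members.add(n)
        n += 1
    return members


print(squares_between(10))  # natural numbers whose squares lie between 10 and 20
print(squares_between(50))  # natural numbers whose squares lie between 50 and 100
```

For m = 10 only n = 4 qualifies, since 16 is the only perfect square between 10 and 20; for some values of m, such as m = 1, the set B is empty.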

An important set that is frequently used is the set of ordered pairs. An
element of this set will be written (m, n), where m and n are not necessarily
numbers and the element (m, n) is different from the element (n, m) unless
m and n are identical. An important use of this set is in the construction of
tables, when the ordered pair becomes an ordered number pair, the first
member of which is usually the argument and the second member the
functional value. Hence the ordered number pair ($\pi/6$, 0.5) could refer to the
sine of the angle $\pi/6$ radians. In this example the relationship between the
first and second numbers of the ordered pair is determinate since $\sin(\pi/6) = 0.5$,
but this is not always the case with ordered pairs. Thus if the ordered pair of
integers (m, n) were used to describe the throw of a die in a series of N
trials, as the statistician would call them, then m could represent the number
of the throw or the trial number, and n the score resulting from that throw.
Here m would range from unity to N, the number of trials in the statistical
experiment, and n would be any integer from 1 to 6. There would then
be no rule by which n could be predicted for any given m.
Ordered number pairs are also encountered when constructing graphs of
functions where the convention is usually that (a, b) signifies the point with
x-coordinate a and y-coordinate b. Thus the graph of the function y = f(x)
for which x is between a and b could be written in set notation


$$S = \{\, (x, f(x)) \mid a < x < b \,\}.$$


The notation of an ordered pair as an element of a set readily extends to 
an ordered triple (m, n, r), which again need not necessarily involve numerical 
quantities, nor need it be determinate. Again, two ordered triples will only be 
identical if their corresponding entries are identical. Ordered number triples 
of a determinate kind occur when considering the graph of a function of two 
independent variables as, for example, the equilibrium temperature at a 
given point of a cross-section of a very long metal bar. 

Statistical events provide the most common source of ordered triples of 
the indeterminate variety. As a simple illustration we may consider the 
statistical experiment comprising N trials, each of which involves tossing a 
coin twice and recording the results of each throw as a ‘head’ (H) or a ‘tail’
(T). Then the first quantity in the ordered triple could record the trial number
with the second and third quantities recording an H or a T according as the
first and second throws gave rise to a ‘head’ or a ‘tail’. A typical ordered
triple would then be (3, T, H) in which the second and third entries in the
ordered triple cannot be predicted from a knowledge of the first entry. 

It is often necessary to study relationships between sets and for this pur- 
pose an algebra of sets must be constructed. The simplest situation that can 
occur is that from a set A, a new set B is formed, such that all elements
of B are also elements of A. Such a set B will be called a subset of A. This
result will be written


$$B \subseteq A,$$


which is to be read ‘B is a subset of A’. 

If x is an element of A, so that we may write $x \in A$, then either $x \in B$
or $x \notin B$. When there are some elements $x' \in A$ which are not to be found in
B, so that $x' \notin B$, then B is called a proper subset of A, the result being written


$$B \subset A.$$


The definition of a subset B of A does not preclude the possibility that
for every element $x \in A$ it is also true that $x \in B$. When this occurs sets A
and B have the same elements and are said to be equal, the result being 
written 


A = B. 


It is clear from the definition of equality that when A = B both the
statements $A \subseteq B$ and $B \subseteq A$ must be true. These last two statements are
often useful as an alternative definition of equality between sets.

With the above definitions it is clear that if A = N and B = {1, 2, 3, 4, 5},
then $B \subset A$; whereas if A = {4, 7, 3, 5, 9} and B = {7, 4, 5, 9, 3}, then
$A \subseteq B$ and $B \subseteq A$ so that A = B.
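
These relations may be checked mechanically. In the Python sketch below, added purely as an illustration, the built-in set type plays the part of a finite set, and the operators <= and < test for subset and proper subset respectively. Since the infinite set N cannot be stored, the finite set {1, 2, ..., 10} is used in its place; this stand-in, and the variable names, are assumptions of the sketch.

```python
# The two sets from the text that list the same elements in a different order.
A = {4, 7, 3, 5, 9}
B = {7, 4, 5, 9, 3}

# Equality of sets: each is a subset of the other.
print(A <= B, B <= A, A == B)

# A finite stand-in for the set N of natural numbers (assumption of this sketch).
naturals_up_to_10 = set(range(1, 11))
small = {1, 2, 3, 4, 5}

# `<` is the proper-subset test: every element of `small` lies in the
# stand-in, and the stand-in has elements that `small` lacks.
print(small < naturals_up_to_10)

# Equal sets are subsets of one another, but never proper subsets.
print(A < B)
```

The order in which elements are listed plays no part in any of these tests, in agreement with the definition of equality of sets.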

A more general situation arises when two sets A and B are involved, each 
of which possesses elements which are not common to the other so that 
neither statement $A \subseteq B$ nor $B \subseteq A$ is true. The set of elements C that is
common to these two sets A and B will be called the intersection of the sets
A and B and is written


$$C = A \cap B.$$


Sometimes this is read ‘A cap B’ with the understanding just defined.
In the event that there are no elements common to the sets A and B we
shall write


$$A \cap B = \emptyset,$$


with the understanding that $\emptyset$ is the null set, which we define to be the set
containing no elements. Under these circumstances the sets A and B are said
to be disjoint.
By way of example, if $A_1$ = {a, b, 1, 3, 5, 7} and $B_1$ = {a, c, d, e, 3, 7, 9},
then $A_1 \cap B_1$ = {a, 3, 7}; whereas if $A_2$ = {1, 3, 7} and $B_2$ = {0, 4, 9, 11},
$A_2 \cap B_2 = \emptyset$.

Fig. 1.1 Symbolic representation of set operations: (a) proper subset; (b)
intersection; (c) union.

Another important set related to sets A and B is the set C containing all 
the elements belonging to A, to B or to both A and B. This is called the union
of sets A and B and is written


$$C = A \cup B,$$


which reads ‘A cup B’. With the sets defined above we obviously have 


$A_1 \cup B_1$ = {a, b, c, d, e, 1, 3, 5, 7, 9} and $A_2 \cup B_2$ = {0, 1, 3, 4, 7, 9, 11}.
Clearly, for any set A we have $\emptyset \subseteq A$, $A \cup \emptyset = A$, and $A \cap \emptyset = \emptyset$.
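
The intersections and unions of the worked example can be reproduced directly. The Python sketch below is an added illustration; the letters a, b, c, d, e are represented by one-character strings, which is a choice made only for this sketch, and the operators & and | denote intersection and union of the built-in set type.

```python
# The two pairs of sets used in the worked example.
A1 = {"a", "b", 1, 3, 5, 7}
B1 = {"a", "c", "d", "e", 3, 7, 9}
A2 = {1, 3, 7}
B2 = {0, 4, 9, 11}

# Intersection: elements common to both sets.
print(A1 & B1)
print(A2 & B2)            # the null set, printed by Python as set()
print(A2.isdisjoint(B2))  # True when there are no common elements

# Union: elements belonging to either set or to both.
print(A1 | B1)
print(A2 | B2)

# Results involving the null set, for an arbitrary set such as A1.
null_set = set()
print(null_set <= A1, (A1 | null_set) == A1, (A1 & null_set) == null_set)
```

Python prints the elements of a set in no particular order, so the displayed order may differ from that used in the text.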

These seemingly abstract ideas can be illustrated symbolically by means 
of a very convenient device. This is the so-called Venn diagram, which uses a
pictorial representation for the sets in question. Sets are represented by the 
interior of closed curves, usually of arbitrary shape, and their relationship is 
then illustrated by the relationships that exist between these curves. Thus, 
when as in Fig. 1.1 (a) curve A representing set A lies within curve B
representing set B, we have the situation that A is a proper subset of B, so that
$A \subset B$. Figs 1.1 (b), (c) illustrate, respectively, the intersection $A \cap B$ and
the union $A \cup B$ of sets A and B, which are shown as shaded areas on those
figures.


Fig. 1.2 Sets in plane: (a) intersection; (b) union.

In general this representation is only symbolic, but in the event that
elements of the sets A and B may be unambiguously represented by points 
in the plane, the Venn diagrams become true representations. 

Let set A comprise all the points within and on a circle of unit radius, 
usually called a unit circle, and centred on the origin, and let B comprise all 
the points within and on the circle of radius 2 centred on the point x = 2.5
on the x-axis. Then the relationships $A \cap B$ and $A \cup B$ are truly represented
by the shaded areas in Figs 1.2 (a), (b).

Similarly, if we consider the sets A and B defined by the interiors and
boundaries of the two unit circles illustrated in Figs 1.3 (a), (b), we see that in
(a), $A \cap B = \{1\}$, so that only the single point x = 1 on the x-axis is common
to A and B, whereas in (b), $A \cap B = \emptyset$.

Fig. 1.3 Intersection of sets in the plane: (a) single point contained in intersection,
$A \cap B = \{1\}$; (b) disjoint sets, $A \cap B = \emptyset$.

A final idea we now introduce in connection with sets A and B is the 
complement of B relative to A, which we shall write as A\B. This is a generali- 
zation of the notion of subtraction and comprises the set of elements of A 
that do not belong to B. The expression A\B is usually read ‘A minus B’ 
and if, for example, A = {a, 1, 3, 7} and B = {a, 7, 9, 11} then A\B = {1, 3}. 
Appealing again to a Venn diagram, we illustrate this relationship by the 
shaded region in Fig. 1-4. 
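
The complement of B relative to A corresponds to the set difference operator of most programming languages. The brief Python sketch below is an added illustration using the example just given, with the element a represented by a one-character string as an assumption of the sketch. It also shows that, unlike union and intersection, the operation depends on the order of the two sets.

```python
A = {"a", 1, 3, 7}
B = {"a", 7, 9, 11}

# A\B: the elements of A that do not belong to B.
a_minus_b = A - B
print(a_minus_b)

# B\A is a different set, so the order of the operands matters.
b_minus_a = B - A
print(b_minus_a)
```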


Fig. 1.4 Symbolic representation of complement of B relative to A.

The following useful results are almost self-evident and are true for
arbitrary sets A, B, and C. They may be proved either from the basic defini- 
tions, or by appeal to Venn diagrams. 


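Basic set operations


$$A \cup A = A \cap A = A, \tag{1.1}$$
$$A \cap B = B \cap A, \tag{1.2}$$
$$A \cup B = B \cup A, \tag{1.3}$$
$$(A \cup B) \cup C = A \cup (B \cup C), \tag{1.4}$$
$$(A \cap B) \cap C = A \cap (B \cap C), \tag{1.5}$$
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C), \tag{1.6}$$
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C). \tag{1.7}$$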


From these there follows an important theorem due to De Morgan: 


THEOREM 1.1 For any three arbitrary sets A, B, and C it is true that


$$A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$$


and


$$A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C).$$


Proof An analytical proof of the first stated result involves the following
two steps: (a) the proof that if x is an arbitrary element such that
$x \in A \setminus (B \cup C)$, then $x \in (A \setminus B)$ and $x \in (A \setminus C)$, showing that


$$A \setminus (B \cup C) \subseteq (A \setminus B) \cap (A \setminus C);$$


and (b) the proof that if $x \in (A \setminus B)$ and $x \in (A \setminus C)$, then $x \in A \setminus (B \cup C)$,
showing that


$$(A \setminus B) \cap (A \setminus C) \subseteq A \setminus (B \cup C).$$


Then by our alternative definition of the equality of two sets P and Q,
whereby P = Q if $P \subseteq Q$ and $Q \subseteq P$, the result will follow. The details,
which are not difficult, are left to the reader. The proof of the second stated
result follows on similar lines.
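
A finite check is no substitute for the proof, but it is a useful safeguard against misreading the theorem. The Python sketch below is an added illustration: it forms every subset of a small universal set, here {1, 2, 3} chosen arbitrarily, and tests both statements of the theorem for every choice of A, B, and C. The function names all_subsets and de_morgan_holds are hypothetical.

```python
from itertools import combinations


def all_subsets(universe):
    """Return a list of every subset of `universe`, including the null set."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def de_morgan_holds(universe) -> bool:
    """Test both statements of the theorem for all subsets A, B, C of `universe`."""
    subsets = all_subsets(universe)
    for A in subsets:
        for B in subsets:
            for C in subsets:
                # First statement: A\(B u C) = (A\B) n (A\C).
                if A - (B | C) != (A - B) & (A - C):
                    return False
                # Second statement: A\(B n C) = (A\B) u (A\C).
                if A - (B & C) != (A - B) | (A - C):
                    return False
    return True


print(len(all_subsets({1, 2, 3})))  # 8 subsets, hence 8 * 8 * 8 triples are tested
print(de_morgan_holds({1, 2, 3}))
```

Agreement over all 512 triples for one small universal set lends confidence in the statement, though only the element-by-element argument outlined in the proof establishes it for arbitrary sets.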


The theorem may be illustrated in general terms, and proved for sets 
which may be represented by points in a plane, by the use of Venn diagrams. 
The three diagrams appropriate to the first stated result are shown in Figs
1.5 (a), (b), and (c), where the shaded regions represent the sets A\B, A\C,
and $A \setminus (B \cup C)$, respectively.

Fig. 1.5 Representation of De Morgan’s theorem: (a) A\B; (b) A\C; (c) $A \setminus (B \cup C)$.

The reader will have noticed that it is a feature of basic set operations
that they essentially combine two sets to generate a third in an unambiguous
manner. It is because of this simple property that operations such as union
and intersection are called binary set operations, the term ‘binary’ referring
to the two sets on which the set operation is performed to generate the third.
Thus the operation $\cap$ acting on any two sets A and B generates a third set
$C = A \cap B$ where, of course, C will be the null set if A and B have no common
elements.

Theorem 1.1 illustrates that operations on sets are not always as simple 
as the formation of the union or intersection of sets. Accordingly, it is neces- 
sary to appreciate clearly the implication of any statement that may be made 
in the derivation of a result. These statements may either be ‘one way’ implica- 
tions or ‘two way’ implications in the following sense. An implication will 
be said to be one way if it is a simple statement of the form ‘result A implies 
result B’. This statement is usually written symbolically in the concise form 


$$A \Rightarrow B.$$


A two way implication arises if from the above statement it also follows 
that ‘result B implies result A’, so that in addition to the previous statement
it is also permissible to write


$$B \Rightarrow A.$$


Rather than write for a two way implication the two results $A \Rightarrow B$ and
$B \Rightarrow A$, the notation is contracted so that the two way implication may be
written concisely in the form


$$A \Leftrightarrow B.$$


The symbol $\Leftrightarrow$ is usually read ‘implies and is implied by’.
Two simple illustrations using sets of integers should clarify these remarks.
We can only write


$$a = 1 \Rightarrow a \text{ is an integer},$$


since the converse statement, a is an integer, does not imply that a = 1.
However, we may obviously write


$$\text{integer } n \text{ contains a factor } 2 \Leftrightarrow n \text{ is an even integer}.$$


Formal development of these and similar ideas is essential if the logical 
structure of mathematics is to be fully appreciated, though these matters
will not be pursued further in this introductory account. 


1.2 Set theory and probability


One of the most direct applications of the elements of set theory is to be
found in a formal introduction to probability theory. Because the notion of 
a probability is fundamental to many branches of engineering and science we 
choose to introduce some basic ideas and definitions now, making full use 
of the notions of set theory. This will serve a dual purpose in that it will 
provide an excellent illustration of a specific application of set theory, whilst 
at the same time introducing an important concept at the very outset of our 
study. 

In some situations the outcome of an experiment is not determinate, so 
one of several possible events may occur. Following statistical practice we 
shall refer to an individual event of this kind as the result or outcome of a 
trial, whereas an agreed number of trials, say N, will be said to constitute an 
experiment. If an experiment comprises throwing a die N times, then a trial 
would involve throwing it once and the outcome of a trial would be the score 
that was recorded as a result of the throw. The experiment would involve 
recording the outcome of each of the N trials. 

In general, if a trial has m outcomes we shall denote them by $E_1, E_2, \ldots, E_m$
and refer to each as a simple event. Hence a trial involving tossing a coin
would have only two simple events as outcomes: namely ‘heads’, which
could be labelled $E_1$, and ‘tails’, which would then be labelled $E_2$. In this
instance an experiment would be a record of the outcomes from a given
number of such trials. A typical record of an experiment involving tossing a
coin eight times would be $E_1, E_2, E_1, E_1, E_1, E_2, E_2, E_1$. With such a simple
experiment the $E_1$, $E_2$ notation has no apparent advantage over writing H in
place of $E_1$ and T in place of $E_2$ to obtain the equivalent record H, T, H, H,
H, T, T, H. The advantage of the $E_i$ notation accrues from the fact that the
subscript attached to the E may be ordered numerically, thereby enabling
easier manipulation of the outcomes during analysis.

Events such as the result of tossing a coin or throwing a die are called 
chance or random events, since they are indeterminate and are supposedly 
the consequence of unbiased chance effects. Experience suggests that the 
relative frequency of occurrence of each such event averaged over a series of 
similar experiments tends to a definite value as the number of experiments 
increases. 

The relative frequency of occurrence of the simple event $E_i$ in a series of
N trials is thus given by the expression


$$\frac{\text{number of occurrences of event } E_i}{N}.$$


By virtue of its definition, this ratio must be non-negative and cannot exceed
unity. For any given N, this ratio provides an estimate of the theoretical
ratio that would have been obtained were N to have been made arbitrarily
large. This theoretical ratio will be called the probability of occurrence of
event $E_i$ and will be written $P(E_i)$. In many simple situations its value may be
arrived at by making reasonable postulates concerning the mechanisms
involved in a trial. Thus when fairly tossing an unbiased coin it would be
reasonable to suppose that over a large number of trials the number of
‘heads’ would closely approximate the number of ‘tails’ so that P(H) = P(T)
= 1/2. Here, of course, P(H) signifies the probability of occurrence of a ‘head’
and P(T) signifies the probability of occurrence of a ‘tail’.
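
The behaviour of the relative frequency ratio as N grows can be imitated on a computer. The Python sketch below is an added illustration and rests on an assumption: the pseudo-random number generator of the standard library is taken as a model of a fairly tossed unbiased coin, a ‘head’ being recorded whenever the generated number is less than 1/2. The function name and the seed value are arbitrary choices.

```python
import random


def relative_frequency_of_heads(trials: int, rng: random.Random) -> float:
    """Simulate `trials` tosses and return (number of heads) / trials."""
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials


# A fixed seed makes the experiment repeatable; its value is arbitrary.
rng = random.Random(12345)

for n_trials in (10, 100, 1_000, 10_000, 100_000):
    ratio = relative_frequency_of_heads(n_trials, rng)
    print(f"N = {n_trials:>6}: relative frequency of heads = {ratio:.4f}")
```

For small N the ratio may differ appreciably from 1/2, whereas for large N it is usually found to lie close to that value, in accordance with the postulate P(H) = P(T) = 1/2.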

If there are m outcomes $E_1, E_2, \ldots, E_m$ of a trial, and they occur with
the respective frequencies $n_1, n_2, \ldots, n_m$ in a series of N trials, then we have
the obvious identity


$$\frac{n_1 + n_2 + \cdots + n_m}{N} = 1.$$


When N becomes arbitrarily large we may interpret each of the relative
frequency ratios $n_i/N$ (i = 1, 2, . . ., m) occurring on the left-hand side as
the probability of occurrence $P(E_i)$ of event $E_i$, thereby giving rise to the
general result


$$P(E_1) + P(E_2) + \cdots + P(E_m) = 1. \tag{1.8}$$


By this time a careful reader will have noticed that the definition of 
probability adopted here has a logical difficulty associated with it, namely, 
the question whether a relative frequency ratio such as $n_i/N$ can be said to
approach a definite number as N becomes arbitrarily large. We shall not
attempt to discuss this philosophical point more fully, but rather be content
that our simple definition in terms of the relative frequency ratio is in accord
with everyday experience.

An examination of Eqn (1.8) and its associated relative frequency ratios
is instructive. It shows the obvious results that:


(a) if event $E_i$ never occurs, then $n_i = 0$ and $P(E_i) = 0$;

(b) if event $E_i$ is certain to occur, then $n_i = N$ and $P(E_i) = 1$;

(c) if event $E_i$ occurs less frequently than event $E_j$, then $n_i < n_j$ and
$P(E_i) < P(E_j)$;

(d) if the m possible events $E_1, E_2, \ldots, E_m$ occur with equal frequency,
then $n_1 = n_2 = \cdots = n_m = N/m$ and $P(E_1) = P(E_2) = \cdots = P(E_m) = 1/m$.


The relationship between sets and probability begins to emerge once it is
appreciated that a trial having m different outcomes is simply a rule by which
an event may be classified unambiguously as belonging to one of m different 
sets. Often a geometrical analogy may be used to advantage when representing 
the different outcomes of a particular trial and such an approach then leads 
directly to a representation closely approximating the Venn diagrams of the 
previous section. 

A convenient example is provided by the simple experiment which involves 
throwing two dice and recording their individual scores. There will be in 
all 36 possible outcomes which may be recorded as the ordered number 
pairs (1, 1), (1, 2), (1, 3), . . ., (2, 1), (2, 2), . . ., (6, 5), (6, 6). Here the first
integer in the ordered number pair represents the score on die 1 and the second
the score on die 2. These may be plotted as 36 points with integer coordinates
as shown in Fig. 1.6 (a).
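
The 36 ordered number pairs can be generated systematically rather than written out by hand. The Python sketch below is an added illustration; it uses the standard library function itertools.product to form every pair (score on die 1, score on die 2), and the variable names are arbitrary.

```python
from itertools import product

# The six possible scores on a single die.
faces = range(1, 7)

# Every ordered pair (score on die 1, score on die 2).
sample_space = list(product(faces, repeat=2))

print(len(sample_space))
print(sample_space[:3], "...", sample_space[-2:])

# Order matters: (1, 2) and (2, 1) are distinct outcomes.
print((1, 2) in sample_space, (2, 1) in sample_space, (1, 2) == (2, 1))
```

The pairs are produced in the same order as the listing in the text, beginning with (1, 1) and ending with (6, 6).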


Fig. 1.6 Sample space for two dice: (a) complete sample space; (b) sample space
for specific outcome. In both panels the horizontal axis is the score on die 1 and
the vertical axis is the score on die 2, each running from 1 to 6.


Because each of the indicated points in Fig. 1.6 (a) lies in a
two-dimensional geometrical space (that is, they are specific points in a plane),
and in their totality they describe all possible outcomes, the representation is 
usually called the sample space of events. The probability of occurrence of 
an event characterized by a point in the sample space is, of course, the 
probability of occurrence of the simple event it represents. 

As a sample space will require a ‘dimension’ for each of its variables it 
is immediately apparent that only in simple cases can it be represented 
graphically. Nevertheless the idea is still useful, as was that of the Venn 
diagram even when it was only symbolic. 


The points in the sample space may be regarded as defining points in a
set D so that specific requirements as to the outcome of a trial will define a
subset A of D, at each point of which the required event will occur. Typical
of this situation would be the case in which a simple event is the throw of two
dice, and the requirement defining the subset is that the combined score after
throwing the two dice equals or exceeds 8. Here the set D would be the 36
points within the square in Fig. 1.6 (b) and the set A the 15 points within
the triangle. Using set notation we may write $A \subset D$.

The sample space representation becomes particularly valuable when 
trials are considered whose outcome depends on the combination of events 
belonging to two different subsets A and B of the sample space. Thus, again 
using our previous example and taking for A the points within the triangle in
Fig. 1.6 (b), the points in B might be determined by the requirement that the
combined score be divisible by the integer 3. The set of points B is then those
contained within the dotted curves of Fig. 1.6 (b).

A new set C may be derived from two sets A and B in two essentially 
different ways according as: 


(a) C contains points in A or B or both; 
(b) C contains points in A and B. 


If desired, these statements about sets may be rewritten as statements 
about events. This is so because there is an unambiguous relationship between
an event and the set of points S in the sample space at which that event occurs.
Thus, for example, we may paraphrase the first statement by saying, the event 
corresponding to points in C denotes the occurrence of the events corresponding 
to points in A or B, or both. Because of this relationship it is often convenient 
to regard an event and the subset of points it defines in the sample space as 
being synonymous. 

The statements provide yet another connection with set theory, since in
(a) we may obviously write $C = A \cup B$, whereas in (b) we must write
$C = A \cap B$. In terms of the sets A and B defined in connection with Fig.
1.6 (b), the set $C = A \cup B$ contains the points in the triangle together with
those within the two dotted curves exterior to the triangle. The set $C = A \cap B$
contains only the five points within the two dotted curves lying inside the
triangle.
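
The sets D, A, and B of this example are small enough to be constructed explicitly, which allows the stated numbers of points to be counted. The Python sketch below is an added illustration with arbitrary variable names; A collects the outcomes whose combined score equals or exceeds 8, and B those whose combined score is divisible by 3.

```python
from itertools import product

# D: the complete sample space of 36 ordered pairs for two dice.
D = set(product(range(1, 7), repeat=2))

# A: combined score equals or exceeds 8.
A = {(i, j) for (i, j) in D if i + j >= 8}

# B: combined score divisible by 3.
B = {(i, j) for (i, j) in D if (i + j) % 3 == 0}

print(len(D), len(A), len(B))

# Points in A or B or both, and points in both A and B.
print(len(A | B), len(A & B))
print(sorted(A & B))
```

The five common points are those with combined score 9 or 12, these being the only multiples of 3 that equal or exceed 8 for two dice.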

Here it should be remarked that the statistician usually avoids the set 
theory symbols $\cup$ and $\cap$, preferring instead to denote the union of A and B
by A + B and their intersection by AB. This largely arises because of the
duality we have already mentioned that exists between an event and the set 
of points it defines; the statistician naturally preferring to think in terms of 
events rather than sets. However, to emphasize the connection with set 
theory we shall preserve the set theory notation. 

Using this duality we now denote by P(A) the probability that an event
corresponding to a point in the sample space lies within subset A, and define
its value to be as follows:


DEFINITION 1.1


P(A) is the sum of the probabilities associated with every point belonging
to the subset A.


In Fig. 1.6 (b) the set A contains the 15 points within the triangle and,
since for unbiased dice each point in the sample space is equally probable,
it follows at once that the probability 1/36 is to be associated with each of
these points. Hence from our definition we see that in this case P(A)
= 15 × (1/36) = 5/12. Similarly, for the set B comprising the 12 points
contained within the dotted curves we have P(B) = 12 × (1/36) = 1/3.

We can now introduce the idea of a conditional probability through the 
following definition. 


DEFINITION 1.2


P(A|B) is the conditional probability that an event known to be associated 
with set B is also associated with set A. 


Clearly we are only interested in the relationship that exists between A and
B, with B now playing the part of a sample space. Because in Definition 1.2
B plays the part of a sample space, but is itself only a subset of the complete
sample space, it is sometimes given the name of the reduced sample space.
In terms of set theory Definition 1.2 is easily seen to be equivalent to

$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \qquad (1.9)$$

which immediately shows us how P(A|B) may be computed. Namely,
P(A|B) is obtained by dividing the sum of the probabilities at points belonging
to the intersection A ∩ B of sets A and B by the sum of the probabilities at
points belonging to B. This ensures that P(B|B) = 1 as would be expected.

We can illustrate this by again appealing to the sets A and B defined in
connection with Fig. 1.6 (b). It has already been established that P(B) = 1/3,
and since there are only five points in A ∩ B, each with a probability 1/36,
it follows that P(A ∩ B) = 5/36. Hence P(A|B) = (5/36)/(1/3) = 5/12. This
result expressed in words states that when two dice are thrown and their
score is divisible by the integer 3, then the probability that it also equals or
exceeds 8 is 5/12.
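The counting argument above can be checked by direct enumeration. The following is a minimal Python sketch, assuming (as the worded result indicates) that A is the set of points with combined score 8 or more and B the set with combined score divisible by 3; the names `sample_space` and `prob` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die), 36 equally likely points.
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event): the sum of the point probabilities 1/36 over the points in the event."""
    return Fraction(len(event), len(sample_space))

# A: combined score equals or exceeds 8; B: combined score divisible by 3.
A = {pt for pt in sample_space if sum(pt) >= 8}
B = {pt for pt in sample_space if sum(pt) % 3 == 0}

p_A = prob(A)                    # 15 points
p_B = prob(B)                    # 12 points
p_A_and_B = prob(A & B)          # 5 points in the intersection
p_A_given_B = p_A_and_B / p_B    # conditional probability as a ratio

print(p_A, p_B, p_A_and_B, p_A_given_B)  # 5/12 1/3 5/36 5/12
```

Exact fractions are used so that the printed values can be compared directly with those in the text.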

A direct consequence of Eqn (1.9) is the so-called probability multiplication
rule:


THEOREM 1.2 If two events define subsets A and B of a sample space, then
P(A ∩ B) = P(B)P(A|B).


Sometimes, when it is given that the event corresponding to points in
subset B occurs, it is also true that P(A|B) depends only on A, so that
P(A|B) = P(A). The events giving rise to subsets A and B will then be said
to be independent. The probability multiplication rule then simplifies in an
obvious manner which we express as follows:


Corollary 1.2 If the events giving rise to subsets A and B of a sample space
are independent, then


P(A ∩ B) = P(A)P(B).


Consideration of the interpretation of P(A ∪ B) leads to another important
result known as the probability addition rule:


THEOREM 1.3 If two events define subsets A and B of a sample space, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).


The proof of this theorem is self-evident once it is remarked that when
computing P(A) and P(B) from subsets A and B and then forming the
expression P(A) + P(B), the sum of probabilities at points in the intersection
A ∩ B is counted twice. Hence P(A) + P(B) exceeds P(A ∪ B) by an amount
P(A ∩ B).


The probability addition rule also has an important special case when
sets A and B are disjoint so that A ∩ B = ∅. When this occurs the events
corresponding to sets A and B are said to be mutually exclusive and we express
the result as follows:


Corollary 1.3 If the events giving rise to subsets A and B of a sample space
are mutually exclusive, then


P(A ∪ B) = P(A) + P(B).


As a simple illustration of Theorem 1.3 we again use the sets A and B
defined in connection with Fig. 1.6 (b) to compute P(A ∪ B). The result is
immediate for we have already obtained the results P(A) = 5/12, P(B) = 1/3,
and P(A ∩ B) = 5/36, so from Theorem 1.3 follows the result


P(A ∪ B) = 5/12 + 1/3 − 5/36 = 11/18.
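The addition rule can be confirmed on the same two sets by counting the points of the union directly. This is a small sketch under the same assumptions as before (A: score 8 or more, B: score divisible by 3); the disjoint pair E, F is an extra illustration that does not appear in the text.

```python
from fractions import Fraction
from itertools import product

points = set(product(range(1, 7), repeat=2))   # the 36 outcomes for two dice

def P(subset):
    """Probability of a subset when every point carries probability 1/36."""
    return Fraction(len(subset), len(points))

A = {pt for pt in points if sum(pt) >= 8}       # score equals or exceeds 8
B = {pt for pt in points if sum(pt) % 3 == 0}   # score divisible by 3

union_direct = P(A | B)                  # count the points of the union
union_by_rule = P(A) + P(B) - P(A & B)   # probability addition rule
print(union_direct, union_by_rule)       # 11/18 11/18

# Mutually exclusive events (illustrative): score 2 and score 12 cannot both occur.
E = {pt for pt in points if sum(pt) == 2}
F = {pt for pt in points if sum(pt) == 12}
print(P(E | F) == P(E) + P(F))           # True
```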


The applications of these theorems and their corollaries are well illustrated 
by the following simple examples. 


Example 1.1 A bag contains a very large number of red and black balls in
the ratio 1 red ball to 4 black. If 2 balls are drawn successively from the bag
at random, what is the probability of selecting

(a) 2 red balls,
(b) 2 black balls,
(c) 1 red and 1 black ball?

Let A1 denote the selection of a red ball first (and either colour second),
and A2 the selection of a red ball second (and either colour first), with B1
and B2 denoting the corresponding selections of a black ball. Then
A1 ∩ A2 is the selection of 2 red balls and, similarly, B1 ∩ B2 is the selection
of 2 black balls. As the balls occur in the ratio 1 red : 4 black it follows that
their relative frequency ratios are 1/5 for a red ball and 4/5 for a black ball, so
P(A1) = 1/5 and P(B1) = 4/5.

The fact that the bag contains a large number of balls implies that the
drawing of one or more balls does not materially alter the relative frequency
ratio that existed at the start, so P(A2) = 1/5 and P(B2) = 4/5. This, together
with the fact that the balls are drawn at random, implies that the drawing of
each ball is an independent event. The independence of the two draws then
allows the use of Corollary 1.2 to determine the required solutions to (a) and
(b). We find that

(a) P(A1 ∩ A2) = (1/5)(1/5) = 1/25,

(b) P(B1 ∩ B2) = (4/5)(4/5) = 16/25.

Now to answer (c) we notice that there are two mutually exclusive orders
in which a red and a black ball may be selected, namely as the event C ∪ D
where C = A1 ∩ B2 (red then black) and D = B1 ∩ A2 (black then red).
From Corollary 1.3 we then have that P(C ∪ D) = P(C) + P(D), where
P(C) and P(D) are determined by Corollary 1.2. This shows that P(C)
= P(A1)P(B2) and P(D) = P(B1)P(A2), so that P(C) = P(D) = (1/5)(4/5)
= 4/25. The solution to (c) becomes

P(C ∪ D) = 4/25 + 4/25 = 8/25.

The three forms of selection (a), (b), and (c) are themselves mutually
exclusive and exhaust all the possibilities, so it must follow that
P(A1 ∩ A2) + P(B1 ∩ B2) + P(C ∪ D) = 1, as is readily checked. Indeed this
result could have been used directly to calculate P(C ∪ D) from P(A1 ∩ A2)
and P(B1 ∩ B2) in place of the above argument using Corollary 1.3.
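The arithmetic of this example is short enough to restate as a sketch in Python. The variable names are illustrative, and the draws are treated as independent exactly as argued above.

```python
from fractions import Fraction

p_red = Fraction(1, 5)     # relative frequency of a red ball
p_black = 1 - p_red        # relative frequency of a black ball, 4/5

# Independent draws: multiply the probabilities of the separate events.
two_red = p_red * p_red
two_black = p_black * p_black

# Red then black, or black then red: two mutually exclusive orders, so add.
one_of_each = p_red * p_black + p_black * p_red

print(two_red, two_black, one_of_each)        # 1/25 16/25 8/25
print(two_red + two_black + one_of_each)      # 1
```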


The previous situation becomes slightly more complicated if only a 
limited number of balls are contained in the bag. 


Example 1.2 A bag contains 50 balls of which 10 are red and the remainder
black. If 2 balls are drawn successively from the bag at random, what is the 
probability of selecting 


(a) 2 red balls, 
(b) 2 black balls, 
(c) 1 red and 1 black ball? 


This time the approach must be slightly different because, unlike Example
1.1, the removal of a ball from the bag now materially alters the probabilities
involved when the next ball is drawn. In fact this is a problem involving
conditional probabilities.

Here we shall define A to be the event that the first ball selected is red,
and B to be the event that the second ball selected is red. To answer (a) we
need the probability that both events occur, which in set notation is
P(A ∩ B), the probability of occurrence of the event associated with A ∩ B.
Its evaluation involves the conditional probability of event B given that
event A has occurred, with the set associated with event A playing the role
of the reduced sample space. Utilizing this observation we now make use of
Theorem 1.2, with the roles of A and B interchanged, to write


P(A ∩ B) = P(A)P(B|A).


Now the relative frequency of occurrence of a red ball at the first draw is
10/(10 + 40) = 1/5, so that P(A) = 1/5. (Not till later will we use the fact
that the relative frequency of occurrence of a black ball is 40/(10 + 40)
= 4/5.)

Given that a red ball has been drawn, 9 red balls and 40 black balls remain
in the bag. If the next ball to be drawn is red then its probability of occurrence
is the conditional probability P(B|A) = 9/(9 + 40) = 9/49. Hence it follows
that the solution to (a) is


P(A ∩ B) = (1/5)(9/49) = 9/245.


It is interesting to compare this with the value 1/25 that was obtained in
Example 1.1 on the assumption that there was virtually an infinite number of
balls in the bag.

If C is defined to be the event that the first ball drawn is black and D the
event that the second ball drawn is black, then to answer (b) we must compute
P(C ∩ D). Obviously, P(C) = 4/5, and by using an argument analogous to
that above it follows that P(D|C) = 39/(10 + 39) = 39/49. Hence the
solution to question (b) is


P(C ∩ D) = (4/5)(39/49) = 156/245.


Again this should be compared with the value 16/25 obtained in Example
1.1.
The simplest way to answer (c) is to use the fact that events (a), (b), and
(c) are mutually exclusive and describe the only possibilities. Hence the
sum of the three probabilities must equal unity. Denoting the probability of
event (c) by P we have


P = 1 − P(A ∩ B) − P(C ∩ D),

showing that P = 1 − 9/245 − 156/245 = 16/49.
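Because the bag now holds only 50 balls, the three answers can also be obtained by listing every ordered pair of distinct balls and counting. The sketch below does this by brute force; labelling the balls by index and the helper name `prob` are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import permutations

# 10 red and 40 black balls; each ball is identified by its position in the list.
balls = ["R"] * 10 + ["B"] * 40

# Every ordered draw of two different balls is equally likely: 50 * 49 = 2450 draws.
draws = list(permutations(range(len(balls)), 2))

def prob(first, second):
    """Probability that the first ball has colour `first` and the second `second`."""
    hits = sum(1 for i, j in draws if balls[i] == first and balls[j] == second)
    return Fraction(hits, len(draws))

two_red = prob("R", "R")
two_black = prob("B", "B")
one_of_each = prob("R", "B") + prob("B", "R")

print(two_red, two_black, one_of_each)   # 9/245 156/245 16/49
```

The enumeration makes no use of conditional probabilities, so its agreement with the values found above is an independent check on Theorem 1.2.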
It is sometimes helpful to bear in mind the following table in which
equivalent statements are expressed using the alternative languages of sets
and probability theory.

Sets           Probability

A ∪ B = C      A + B = C; the event corresponding to C is defined as
               the occurrence of at least one of the events
               corresponding to A or B or both.

A ∩ B = C      AB = C; the event corresponding to C is defined as
               the occurrence of both of the events corresponding to
               A and B.

A ∩ B = ∅      AB = 0; events corresponding to A and B are mutually
               exclusive.

A = ∅          A = 0; the event corresponding to A does not occur.

B ⊂ A          B ⇒ A; the event corresponding to B implies that
               corresponding to A.

A\B            the event corresponding to A and not that
               corresponding to B.

To close this section with a brief examination of repeated trials, the ideas 
of a permutation and a combination must be utilized. The student will already 
be familiar with these concepts from elementary combinatorial algebra and 
so we shall only record two definitions. 


DEFINITION 1.3 A permutation of a set of n mutually distinguishable
objects r at a time is an arrangement, or an enumeration of the objects, in
which their order of appearance counts.


Thus of the five letters a, b, c, d, e the arrangements a, b, c and a, c, b
represent two different permutations of three of the five letters. These are
described as permutations of five letters taken three at a time. Other
permutations of this kind may be obtained by further re-arrangement of the letters
a, b, c and by the replacement of any of them by either or both of the
remaining two letters d and e.

The total number of different permutations of n objects r at a time will be
denoted by $^nP_r$ and it is left to the reader to prove as an exercise that

$$^nP_r = \frac{n!}{(n-r)!}, \qquad (1.10)$$

where n! (factorial n) = n(n − 1)(n − 2) . . . 3.2.1, and we adopt the
convention that 0! = 1.
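The count n!/(n − r)! can be compared with an explicit listing for the five letters used above. This is only a sketch: `itertools.permutations` and `math.perm` are standard Python library functions, and the variable names are illustrative.

```python
from itertools import permutations
from math import factorial, perm

letters = "abcde"
n, r = len(letters), 3

# List every ordered arrangement of three of the five letters.
arrangements = list(permutations(letters, r))

by_listing = len(arrangements)
by_formula = factorial(n) // factorial(n - r)   # n!/(n - r)!

print(by_listing, by_formula, perm(n, r))       # 60 60 60

# Order counts: a, b, c and a, c, b are different permutations.
print(("a", "b", "c") in arrangements, ("a", "c", "b") in arrangements)  # True True
```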


DEFINITION 1.4 A combination of a set of n mutually distinguishable
objects r at a time is a selection of r objects from the n without regard to
their order of arrangement.

It follows from the definition of a permutation that a set of r objects may
be arranged in r! different ways so that, denoting the number of different
combinations of n objects r at a time by $\binom{n}{r}$, we must have

$$^nP_r = r! \binom{n}{r}.$$

This gives the important result

$$\binom{n}{r} = \frac{n!}{(n-r)!\, r!}. \qquad (1.11)$$

In many books it will often be found that the expression $^nC_r$ is written in
place of $\binom{n}{r}$. The numbers $\binom{n}{r}$ are usually called binomial
coefficients because of their occurrence in the binomial expansion

$$(p+q)^n = \sum_{r=0}^{n} \binom{n}{r} p^r q^{n-r}, \quad \text{with } n \text{ a positive integer.} \qquad (1.12)$$

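The relation between permutations and combinations, and the binomial expansion itself, can both be spot-checked numerically. In this sketch the values n = 5, r = 3 and the two fractions chosen for p and q are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial, perm

n, r = 5, 3

# Selections of three letters from five, without regard to order.
selections = list(combinations("abcde", r))
print(len(selections), comb(n, r))               # 10 10

# Each selection can be arranged in r! ways, giving the permutation count.
print(perm(n, r) == factorial(r) * comb(n, r))   # True

# Binomial expansion checked exactly for one arbitrary pair p, q.
p, q = Fraction(2, 3), Fraction(5, 7)
lhs = (p + q) ** n
rhs = sum(comb(n, k) * p**k * q**(n - k) for k in range(n + 1))
print(lhs == rhs)                                # True
```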
Now consider an experiment involving a series of independent trials in
each of which only one of two events A or B may occur. Then if the
probabilities of occurrence of events A and B are p and q, respectively, we must
obviously have p + q = 1. If n such trials constitute an experiment, we might
wish to know with what probability the experiment may be expected to yield
r events of type A. The statistician will call such a situation repeated
independent trials.

An experiment will be deemed to be successful if r events of type A and
n − r events of type B occur, irrespective of their order of occurrence.
Clearly this can happen in $\binom{n}{r}$ different ways and by Corollary 1.2, since
the trials are independent, the probability of occurrence of any one of these
ways will be $p^r (1-p)^{n-r}$. Hence, as the different ways are also mutually
exclusive, it follows from Corollary 1.3 that the required probability P(r) of
occurrence of r events of type A, each with probability of occurrence p, in n
independent trials is

$$P(r) = \binom{n}{r} p^r (1-p)^{n-r}. \qquad (1.13)$$


Identifying the p and q of Eqn (1.12) with the probabilities of occurrence
of the events A and B just discussed, we see that q = 1 − p, so that Eqn (1.12)
takes the form

$$1 = \sum_{r=0}^{n} \binom{n}{r} p^r (1-p)^{n-r}. \qquad (1.14)$$

Each term on the right-hand side of Eqn (1.14) then represents the probability
of occurrence of an event of the form just discussed. For example, the first
term

$$P(0) = \binom{n}{0} (1-p)^n$$

is the probability that event A will never occur in a series of n independent
trials, whilst the third term

$$P(2) = \binom{n}{2} p^2 (1-p)^{n-2}$$

is the probability that event A will occur exactly twice in a series of n
independent trials.

The n + 1 numbers P(r), r = 0, 1, . . ., n have, by definition, the property
that

$$P(0) + P(1) + \cdots + P(n) = 1, \qquad (1.15)$$

and they are said to define a discrete probability distribution. It is
conventional to plot them in histogram fashion when they illustrate the probabilities
to be associated with the n + 1 possible outcomes of an experiment involving
n trials. Fig. 1.7 (a) illustrates the case in which n = 4 and p = 1/4 so that

$$P(0) = \binom{4}{0} \left(\frac{1}{4}\right)^0 \left(\frac{3}{4}\right)^4 = \frac{81}{256}, \qquad P(1) = \binom{4}{1} \left(\frac{1}{4}\right)^1 \left(\frac{3}{4}\right)^3 = \frac{27}{64}$$

and, similarly, P(2) = 54/256, P(3) = 3/64, and P(4) = 1/256.

Fig. 1.7 Binomial distribution (n = 4, p = 1/4): (a) binomial probability
density function; (b) binomial cumulative distribution function.

Because of the origin of this distribution, Eqn (1.13) is said to define the
binomial distribution. This distribution is historically associated with Jacob
Bernoulli (1654-1705) and experiments of the type just examined are
sometimes referred to as Bernoullian trials. When the cumulative total

$$U(r) = \sum_{s=0}^{r} P(s), \qquad (1.16)$$

is plotted in histogram fashion against r the result is called the cumulative
distribution function. The cumulative distribution function corresponding to
Fig. 1.7 (a) is shown in Fig. 1.7 (b). It is conventional to refer to the P(r)
as the probability density function or the frequency function since it describes
the proportion of observations appropriate to the value of r.
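The values plotted in the two histograms for n = 4 and p = 1/4 can be generated directly from the binomial formula and its running total. The following is a minimal sketch; the function name `binomial_pmf` is an illustrative choice and exact fractions are used in place of decimals.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

def binomial_pmf(n, p):
    """Return the list [P(0), P(1), ..., P(n)] of binomial probabilities."""
    return [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]

pmf = binomial_pmf(4, Fraction(1, 4))   # probability density function P(r)
cdf = list(accumulate(pmf))             # cumulative totals U(r)

for r, (P_r, U_r) in enumerate(zip(pmf, cdf)):
    print(r, P_r, U_r)
# 0 81/256 81/256
# 1 27/64 189/256
# 2 27/128 243/256
# 3 3/64 255/256
# 4 1/256 1
```

Note that 54/256 is printed in its lowest terms as 27/128, and that the final cumulative total is 1, as the sum of all the P(r) requires.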


Example 1.3 If an unbiased coin is tossed six times, what is the probability
that only two 'heads' will occur in the sequence of results?
As the coin is unbiased p = q = 1/2 and so

$$P(2) = \binom{6}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^4 = \frac{15}{64}.$$


It is an immediate consequence of Eqn (1.13) that

(a) if A occurs with probability p in independent trials then the
probability that it will occur at least r times in n trials is

$$\sum_{s=r}^{n} \binom{n}{s} p^s (1-p)^{n-s};$$

(b) if A occurs with probability p in independent trials then the probability
that it will occur at most r times in n trials is

$$\sum_{s=0}^{r} \binom{n}{s} p^s (1-p)^{n-s};$$

and to this we may add Eqn (1.13) in this form:

(c) if A occurs with probability p in independent trials then the
probability that it will occur exactly r times in n trials is

$$\binom{n}{r} p^r (1-p)^{n-r}.$$


Example 1.4 What is the probability of hitting a target when three shells
are fired, assuming each to have a probability 1/2 of making a hit?

Obviously here p = 1/2 and we will have satisfied the conditions of the
question if at least one shell finds the target. Accordingly, using (a) above,
the result is

$$\sum_{s=1}^{3} \binom{3}{s} \left(\frac{1}{2}\right)^s \left(\frac{1}{2}\right)^{3-s}.$$

Hence the required probability is 3/8 + 3/8 + 1/8 = 7/8.
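The three results (a), (b), and (c) translate directly into short functions, which may then be applied to the coin and target examples. This is a sketch only: the function names are illustrative, and the hit probability of 1/2 per shell is taken here as an assumption.

```python
from fractions import Fraction
from math import comb

def exactly(n, r, p):
    """Probability that A occurs exactly r times in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def at_least(n, r, p):
    """Probability that A occurs at least r times in n independent trials."""
    return sum(exactly(n, s, p) for s in range(r, n + 1))

def at_most(n, r, p):
    """Probability that A occurs at most r times in n independent trials."""
    return sum(exactly(n, s, p) for s in range(0, r + 1))

half = Fraction(1, 2)

print(exactly(6, 2, half))    # two heads in six tosses: 15/64
print(at_least(3, 1, half))   # at least one hit from three shells: 7/8

# 'At least one' is the complement of 'none', which gives a shorter route.
print(1 - exactly(3, 0, half))   # 7/8
```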


So far the sample spaces we have used have involved discrete points, and
it is for this reason that the term discrete has been used in conjunction with
the definition of the binomial distribution. In other words, in discrete
distributions, no meaning is to be attributed to points that are intermediate
between the discrete sample space points. In particular, referring to Fig.
1.6 (a), there is no score to be attributed to a point with horizontal
coordinate 2.5 and vertical coordinate 4.2 any more than there is to the point
with horizontal coordinate 11 and vertical coordinate 9.

However, situations occur in which perfectly satisfactory sample spaces 
can be defined which associate an event with every point of the sample space, 
and not just certain discrete points. The definition of a distribution function 
appropriate to this case requires ideas from the calculus and will not be 
discussed here. In statistics such distributions are called continuous distri- 
butions. 


1.3 Integers, rationals and arithmetic laws 


The reader will already be familiar with the fact that if the arithmetic
operation of addition is performed on the natural numbers, or the positive integers
as they are often called, the result will also be a positive integer. Written
symbolically this statement becomes a, b ∈ N ⇒ (a + b) ∈ N. However the
arithmetic operation of subtraction is less simple, since we know from direct
experience that even when a, b ∈ N, this does not necessarily imply that
a − b is a positive integer. Indeed, in general a − b may be equal to some
positive or negative integer or to zero.

Thus an attempt always to express the result of subtraction of natural
numbers in terms of the natural numbers themselves must fail. This is usually
expressed by saying that the system of natural numbers N is not closed with
respect to subtraction. The difficulty is of course resolved by supplementing
the set of natural numbers N by the set N* = {. . ., −3, −2, −1, 0} of
negative integers and zero. If now in place of N we use the complete set of
integers I = N* ∪ N already encountered in Problem 1.1, the assertions
a, b ∈ I ⇒ (a + b) ∈ I and a, b ∈ I ⇒ (a − b) ∈ I become unconditionally true.

The need to generalize the notion of the natural numbers N to the complete
set of integers I is thus seen to arise as a natural result of seeking a
number system in which the binary arithmetic operation inverse to addition,
namely the operation of subtraction, can always be performed. However, the set of
numbers I is still far from adequate to enable everyday practical arithmetic
to be performed. To see this it is only necessary to comment that although the
product of two integers belonging to I itself lies in I, the quotient of two
integers belonging to I does not necessarily lie in I. Thus the complete set of
integers I is not closed with respect to division. Symbolically we can write
this as a, b ∈ I ⇒ ab ∈ I, but a, b ∈ I ⇒ a/b ∈ I only if b ≠ 0 and a = kb
with k ∈ I. The symbol ≠ used here is to be read 'not equal to' and the
condition involving k simply ensures that the quotient a/b is integral.

Here again the operations of multiplication and division are inverse
binary arithmetic operations. To remove the artificial restriction on division,
so that the quotient of any two non-zero integers becomes a number in some
number system, we must still further extend the system I of integers. This is
achieved by introducing the familiar system R* of rational numbers, which is
defined as the set of all numbers of the form a/b, where b ≠ 0 and a, b ∈ I.
Obviously, since integers are just a special case of rational numbers and, for
example, 2 is represented by any of the rationals 2/1, 4/2, 10/5, . . ., the set
R* also contains all the integers and so we may write I ⊂ R*.


Numerous though the rational numbers obviously are, we now show how 
they may be arranged in a definite order and counted. One way in which this 
may be achieved is indicated in the following array which recognizes as 
different all rational representations in which cancelling of common factors 
has not been performed. Thus, for example, in this scheme 4/2, 6/3, 8/4,. . ., 
are counted as different rational numbers, despite the fact that they all 
represent 2. If desired these repetitions may be omitted from the resulting 
sequence of rational numbers, though the matter is not important. The 
counting or enumeration of the rationals proceeds in the order indicated by 
the arrows: 


[Array of rationals with arrows indicating the order of enumeration.]


If this form of enumeration is adopted then the first few rationals to be
specified are


0, 1, 1/2, 1, 2, −2, −1, −1/2, −1, −3/2, −3, . . .


As already mentioned, if desired the repetitions may be deleted, so that the
start of the sequence would then become


0, 1, 1/2, 2, −2, −1, −1/2, −3/2, −3, . . .


Clearly all rationals are included somewhere in this scheme, so that as
each one may be put into correspondence with an integer, the mathematician
is entitled to say that the rationals are countable, despite the fact that they are
infinite in number. What this construction has established is the rather
remarkable result that the rationals are no more numerous than the set of
positive integers themselves.
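Countability can be made concrete by writing a procedure that lists the rationals one after another. The sketch below deliberately uses a different ordering from the array above: after 0 it groups fractions m/n in lowest terms by the value of m + n, which is an assumption made here for simplicity. Any scheme that reaches every rational after finitely many steps serves the same purpose.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals():
    """Yield every rational number exactly once: 0 first, then by increasing m + n."""
    yield Fraction(0)
    total = 2
    while True:
        for n in range(1, total):          # denominator n, numerator m = total - n
            m = total - n
            if gcd(m, n) == 1:             # skip repetitions such as 2/2 or 4/2
                yield Fraction(m, n)
                yield Fraction(-m, n)
        total += 1

first = list(islice(rationals(), 9))
print([str(q) for q in first])
# ['0', '1', '-1', '2', '-2', '1/2', '-1/2', '3', '-3']

# Pairing each rational with its position gives the correspondence with
# the positive integers: 1 <-> 0, 2 <-> 1, 3 <-> -1, and so on.
position = {q: k for k, q in enumerate(islice(rationals(), 200), start=1)}
print(position[Fraction(-3, 2)])
```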

It might, at first sight, seem that the rationals R* must contain all possible 
numbers. In fact this is far from the truth since it is possible to show that 
numbers exist which are not expressible as a rational fraction and yet which 
lie between two rationals, however close they may be. For obvious reasons 
they are called irrational numbers, and to substantiate our assertion we now 
prove the existence of one such number. 

We will show that √2 is irrational or, to phrase the statement more
precisely, that there is no fraction of which the square is 2. The argument 
starts from a given assumption and then produces a contradiction, thereby 
showing that the original assumption must be false. It is called an argument 
by contradiction and is a device frequently used in higher mathematics. 


Suppose that m/n is such that m and n are integers having no common
factor and (m/n)² = 2. Then m² = 2n² so that m² must be even and hence
m itself is even. Because m is even we may set m = 2r, where r is some integer.
(Why?) Then 4r² = 2n², or 2r² = n², which now shows that n² and hence n
must be even. The fact that n is even now allows us to set n = 2s and thus the
numbers m and n have a common factor 2, contradicting the initial
assumption. Hence the original assumption that √2 is capable of representation in
the rational form m/n is false. We have thus proved that √2 is an irrational
number.


It is established in higher mathematics courses that the irrational numbers 
are so much more numerous than the rationals that they cannot be enumerated. 
We make no attempt to justify this claim here. Instead we refer the interested 
reader to Problems 1-32 to 1-35 if he wishes to gain a little more insight into 
the relationship that exists between the rationals and the irrationals. A final 
important result arising from a deeper study of these matters, and to which 
we make only a passing reference, is the fact that between them, the rational 
and the irrational numbers exhaust all the possible types of numbers. In 
effect this is saying that if we work with real rational and irrational numbers, 
then there are no gaps left in the number system that can only be filled by the 
introduction of yet another kind of number. This is important because it means 
that however we may arrive at a number, as the result of a finite or an infinite 
sequence of operations, it will either be a rational or an irrational number. 

If the set R* of rational numbers is supplemented by the inclusion of the
irrational numbers, the resulting set R is called the real number system or
the field of real numbers. The fact that R contains all possible types of real
numbers is expressed by saying that the set of real numbers R is complete.
Consequently, until we have occasion to consider entities such as √−1
there will be no need for us to work outside the real number system R.
Numbers called transcendental numbers form an important subset of the
irrational numbers. These are numbers like e and π which are not defined as
the root of a polynomial with rational coefficients (cf. § 2.3).
For future reference it will be useful to summarize the basic properties 
of the field of real numbers already known to the reader. We now do this 
making full use of the mathematical shorthand so far introduced. 


Additive properties
A.1 a, b ∈ R ⇒ (a + b) ∈ R; R is closed with respect to addition.
A.2 a, b ∈ R ⇒ a + b = b + a; addition is commutative.
A.3 a, b, c ∈ R ⇒ (a + b) + c = a + (b + c); addition is associative.
A.4 For every a ∈ R there exists a number 0 ∈ R such that 0 + a = a;
    there is a zero element in R.
A.5 If a ∈ R then there exists a number −a ∈ R such that −a + a = 0;
    each number has a negative.

Multiplicative properties
M.1 a, b ∈ R ⇒ ab ∈ R; R is closed with respect to multiplication.
M.2 a, b ∈ R ⇒ ab = ba; multiplication is commutative.
M.3 a, b, c ∈ R ⇒ (ab)c = a(bc); multiplication is associative.
M.4 There exists a number 1 ∈ R such that 1.a = a for all a ∈ R;
    there is a unit element in R.
M.5 Let a be a non-zero number in R, then there exists a number a⁻¹ ∈ R
    such that a⁻¹a = 1; each non-zero number has an inverse. Usually
    we shall write 1/a in place of a⁻¹, so that the two expressions are
    to be taken as being synonymous.

Distributive property
D.1 a, b, c ∈ R ⇒ a(b + c) = ab + ac; multiplication is distributive.


The above results are self-evident for real numbers and are usually called
the real number axioms. They are used by mathematicians as the logical basis
for our number system. Later we shall encounter other systems of objects
which, though sharing many of the properties of real numbers, are not
themselves numbers. For future reference we mention matrices, for which M.1 to
M.5 are not generally true, and vectors, for which two forms of multiplication
exist and for which M.5 has no meaning.
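As a foretaste of how the multiplicative properties can fail for objects other than real numbers, the sketch below multiplies a few 2 × 2 matrices using the usual row-by-column rule. The rule itself is introduced only later, so both it and the particular matrices chosen are assumptions made here for illustration.

```python
def matmul(X, Y):
    """Row-by-column product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)          # ((2, 1), (1, 1))
print(BA)          # ((1, 1), (1, 2))
print(AB == BA)    # False: the commutative property fails

# A non-zero matrix whose square is the zero matrix; such a matrix can have
# no inverse, so the inverse property fails as well.
N = ((0, 1), (0, 0))
print(matmul(N, N))   # ((0, 0), (0, 0))
```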

It is an immediate consequence of these axioms that commonplace
arithmetic operations may be performed without question. For example, it is
fundamental to arguments that a − b = 0 ⇔ a = b, and aξ = aη ⇒ ξ = η
if a ≠ 0. These, and other elementary results of similar form, follow directly
as a result of simple applications of the axioms. As it would be out of place
to develop these ideas here we shall indicate the proof of just one such result,
stating the others in the form of Problem 1.37 which is left to be attempted by
the reader who wishes to question further the basis of the real number system.


We prove that there is a unique zero in R. The argument is again by
contradiction, for we first suppose that two different zero elements exist and
denote them by 0 and 0'. Then by A.4 it follows that 0 + 0' = 0' and 0' + 0
= 0, whence by A.2 we must have 0 = 0', contradicting the supposition that
they differ and thereby establishing the uniqueness of the zero.


So far our list of properties of real numbers has been concerned only with 
equalities. The valuable property of real numbers that they can be arranged 
according to size, or ordered, has so far been overlooked. It is of course this 
property that allows us to represent real numbers by points on a line and 
thereby to construct graphs and other valuable geometrical representations. 
Ordering is achieved by utilizing the concept ‘greater than’ which when used 
in the form ‘a greater than b’, is denoted by a > b. Hence to the other real 
number axioms must be added: 


Order properties


O.1 If a ∈ R then exactly one of the following is true; either a > 0 or
    a = 0 or −a > 0.

O.2 a, b ∈ R, a > 0, b > 0 ⇒ a + b > 0, and ab > 0.

We now define a > b and a < b, the latter being read 'a less than b', by
a > b ⇔ a − b > 0 and a < b ⇔ b − a > 0. The following results are
obvious consequences of the real number system and are called inequalities.
In places they also involve the symbol ≥ which is to be read 'greater than or
equal to'.


Elementary inequalities in R
I.1 a > b and c > d ⇒ a + c > b + d.
I.2 a > b > 0 and c > d > 0 ⇒ ac > bd.
I.3 k > 0 and a > b ⇒ ka > kb.
I.4 a > b ⇒ −a < −b.
I.5 a < 0, b > 0 ⇒ ab < 0; a < 0, b < 0 ⇒ ab > 0.
I.6 a > 0 ⇒ a⁻¹ > 0; a < 0 ⇒ a⁻¹ < 0.
I.7 a > b > 0 ⇒ b⁻¹ > a⁻¹ > 0; a < b < 0 ⇒ b⁻¹ < a⁻¹ < 0.


An important use of inequalities is in defining intervals on a line and
regions in a plane. Using the order property of numbers to associate numbers
with points on a line, an interval on a line may be considered to be a segment
of the line between two given points or numbers, a and b, say. Three cases
arise according as to whether (a) both end points are included in the interval,
(b) both end points are excluded from the interval, or (c) one is included and
one is excluded. These are called, respectively, (a) a closed interval, (b) an
open interval, (c) a semi-open interval. Namely, an interval is closed at an
end which contains the end point, otherwise it is open at that end. In terms of
the points a and b and the variable x representing an arbitrary point on the
line these are written:


(a) a ≤ x ≤ b; closed interval;
(b) a < x < b; open interval;
(c) a ≤ x < b or a < x ≤ b; semi-open interval.


Fig. 1.8 Intervals on a line: (a) closed interval a ≤ x ≤ b; (b) open interval
a < x < b; (c) semi-open interval a ≤ x < b.


Thus 1 ≤ x < 2 defines the semi-open interval containing the point x = 1
and the points up to, but not including, x = 2. These are represented in
Fig. 1.8 in which a solid line represents points in the interval, a circle represents
an excluded point, and a dot an included point.

Special cases occur when one or both of the end points of the interval
are at infinity. The intervals −∞ < x < a and b < x < ∞ are called
semi-infinite intervals and −∞ < x < ∞ is an unbounded interval or, more
simply, the complete real line.

We illustrate the corresponding definition of a region in the (x, y)-plane
by considering the three inequalities x² + y² ≤ a², y < x, x ≥ 0. The first
defines the points inside or on a circle of radius a centred on the origin, the
second defines points below, but not on, the straight line y = x, and the third
defines points in the right half of the (x, y)-plane including the points on the
y-axis itself. These curves represent boundaries of the regions in question and the
boundary points are only to be included in the region when possible equality
is indicated by use of the signs ≥ or ≤. The three regions are indicated in
Fig. 1.9 (a) in which a full line indicates that points on it are to be included, a
dotted line indicates that points on it are to be excluded, and shading indicates
the side of the line on which the region in question must lie. Fig. 1.9 (b)
indicates the region in which all the inequalities are satisfied.

Fig. 1.9 Regions in plane: (a) region boundaries x² + y² = a², y = x, and x = 0;
(b) region x² + y² ≤ a², y < x, x ≥ 0.

Simple inequalities of the form $(x + 1)(x + 3) > (x - 1)(x - 2)$ also 
define intervals. For, clearing the brackets, we see that 
$x^2 + 4x + 3 > x^2 - 3x + 2$ which, by simple application of the elementary 
inequalities just listed, reduces to $7x > -1$, or $x > -1/7$, defining a 
semi-infinite interval, open at the end $x = -1/7$. 
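
The reduction of this inequality can be checked numerically. The short Python sketch below is only an illustration (the function name `satisfies` is ours, not part of the text); it evaluates both sides with exact rational arithmetic at the end point $x = -1/7$ and at points a little to either side of it.

```python
from fractions import Fraction

def satisfies(x):
    """True when (x + 1)(x + 3) > (x - 1)(x - 2)."""
    return (x + 1) * (x + 3) > (x - 1) * (x - 2)

boundary = Fraction(-1, 7)
step = Fraction(1, 1000)

# Points to the right of -1/7 satisfy the inequality; the end point
# itself and points to its left do not, so the interval is open there.
print(satisfies(boundary + step))   # True
print(satisfies(boundary))          # False
print(satisfies(boundary - step))   # False
```

The difference of the two sides is $7x + 1$, which is why the sign changes exactly at $x = -1/7$.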

The elementary inequalities may often be used to advantage to simplify 
complicated algebraic expressions by yielding helpful qualitative information 
as the following example indicates. 


Example 1.5 Prove that if $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ 
are positive real numbers, then 

$$\min_{1 \le r \le n}\left(\frac{a_r}{b_r}\right) \le \frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} \le \max_{1 \le r \le n}\left(\frac{a_r}{b_r}\right).$$

Here the left-hand side of the inequality is to be interpreted as meaning 
the minimum value of the expression $(a_r/b_r)$, with r assuming any of the 
integral values between 1 and n, and the right-hand side is to be similarly 
interpreted reading maximum in place of minimum. The result follows by 
noticing that 

$$\frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} = \frac{1}{\sum_{r=1}^{n} b_r}\left[b_1\left(\frac{a_1}{b_1}\right) + b_2\left(\frac{a_2}{b_2}\right) + \cdots + b_n\left(\frac{a_n}{b_n}\right)\right],$$

where $\sum_{r=1}^{n} b_r = b_1 + b_2 + \cdots + b_n$. For if each of the 
expressions $(a_1/b_1), (a_2/b_2), \ldots, (a_n/b_n)$ is replaced by the 
smallest of these ratios, which could be the value taken by all the 
expressions if $a_1 = a_2 = \cdots = a_n > 0$ and 
$b_1 = b_2 = \cdots = b_n > 0$, then 

$$\frac{a_1 + a_2 + \cdots + a_n}{b_1 + b_2 + \cdots + b_n} \ge \min_{1 \le r \le n}\left(\frac{a_r}{b_r}\right)\frac{(b_1 + b_2 + \cdots + b_n)}{\sum_{r=1}^{n} b_r} = \min_{1 \le r \le n}\left(\frac{a_r}{b_r}\right),$$

which is the left half of the inequality. The right half follows by identical 
reasoning if maximum is written in place of minimum. 
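
A numerical illustration of Example 1.5 may help. The following Python sketch is not part of the original argument: the helper `ratio_bounds` and the sample values are our own arbitrary choices, and exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

def ratio_bounds(a, b):
    """Return (min a_r/b_r, sum(a)/sum(b), max a_r/b_r) for positive a_r, b_r."""
    ratios = [Fraction(x) / Fraction(y) for x, y in zip(a, b)]
    return min(ratios), Fraction(sum(a)) / Fraction(sum(b)), max(ratios)

# Arbitrary positive sample values, chosen only for illustration.
a = [2, 3, 18, 22, 1]
b = [3, 2, 18, 11, 2]

low, middle, high = ratio_bounds(a, b)
print(low, middle, high)

# The ratio of the sums lies between the smallest and largest ratio.
assert low <= middle <= high
```

A check of this kind illustrates the result for one data set only; the proof above is what establishes it in general.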
1-4 Absolute value of a real number 


DEFINITION 1.5. The absolute value |a| of the real number a provides a 
measure of its size without regard to sign, and is defined as follows: 

$$|a| = \begin{cases} a & \text{when } a \ge 0, \\ -a & \text{when } a < 0. \end{cases}$$

Thus if a = 3, then |a| = 3 and if a = -5.6 then |a| = 5.6. 

There are three immediate consequences of this definition which we now 
enumerate as 


THEOREM 1-4 If $a, b \in R$ then 
(a) $|ab| = |a|\,|b|$, 
(b) $|a + b| \le |a| + |b|$, 
(c) $|a - b| \ge \big||a| - |b|\big|$. 


The proof is simply a matter of enumerating the possible combinations 
of positive and negative a and b, and then making a direct application of the 
definition of the absolute value. We shall only illustrate the proof of (a). 

There are four cases to be considered: firstly $a \ge 0$, $b \ge 0$, secondly 
$a \ge 0$, $b < 0$, thirdly $a < 0$, $b \ge 0$, and fourthly $a < 0$, $b < 0$. 
If $a \ge 0$, $b \ge 0$ then $ab \ge 0$ and so $|ab| = ab = |a|\,|b|$. The 
second and third situations are essentially similar so we shall discuss only 
the second. As $a \ge 0$, $b < 0$ we have $ab \le 0$, whence 
$|ab| = -ab = a(-b) = |a|\,|b|$. Finally, if $a < 0$, $b < 0$ then $ab > 0$ 
and $|ab| = ab = (-a)(-b) = |a|\,|b|$, establishing (a). For reasons we give 
later, result (b) is usually called the triangle inequality. 
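
The three results of the theorem are easily tested on numerical examples covering every combination of signs. The Python sketch below is offered only as an illustration; the function name `check_absolute_value_results` and the sample values are our own.

```python
from fractions import Fraction
from itertools import product

def check_absolute_value_results(a, b):
    """Return the truth of results (a), (b) and (c) for one pair a, b."""
    return (
        abs(a * b) == abs(a) * abs(b),          # (a)
        abs(a + b) <= abs(a) + abs(b),          # (b) triangle inequality
        abs(a - b) >= abs(abs(a) - abs(b)),     # (c)
    )

# Arbitrary sample values of both signs, including zero.
samples = [Fraction(-28, 5), Fraction(-2), Fraction(0), Fraction(3), Fraction(15, 2)]

for a, b in product(samples, repeat=2):
    assert all(check_absolute_value_results(a, b))
print("all", len(samples) ** 2, "sample pairs agree with the three results")
```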


The absolute value may also be used to define intervals since an expression 
of the form $|a - x| \ge 2$ implies two inequalities according as $a - x$ is 
positive or negative. If $a - x \ge 0$ then $|a - x| = a - x$ and we have 
$a - x \ge 2$ or $x \le a - 2$. However if $a - x < 0$, then by the definition 
of the absolute value of $a - x$ we must have $|a - x| = -(a - x)$ showing that 
$-(a - x) \ge 2$, or, $x \ge a + 2$. Taken together the results require that x 
may be equal to or greater than $2 + a$ or equal to or less than $a - 2$. x may 
not lie in the intervening interval of length 4 between $x = a - 2$ and 
$x = a + 2$. This is illustrated in Fig. 1-10 (a) where a solid line is again 
used to indicate points in the interval satisfied by $|a - x| \ge 2$ and the 
dots are to be included in the appropriate intervals. 

By exactly similar reasoning we see that if we consider the inequality 
$1 < |x + 1| \le 2$, then if $x + 1 \ge 0$, $|x + 1| = x + 1$ and the 
inequality becomes $1 < x + 1 \le 2$. Hence the interval is $0 < x \le 1$. 
However, if $x + 1 < 0$, then $|x + 1| = -x - 1$ and so the inequality becomes 
$1 < -x - 1 \le 2$ giving rise to the interval $-3 \le x < -2$. These intervals 
are shown in Fig. 1-10 (b) with circles indicating points excluded from the 
end of the solid line intervals and dots indicating points to be included. 

Fig. 1-10 Intervals on a line: (a) $|a - x| \ge 2$; (b) $1 < |x + 1| \le 2$. 


1-5 Representation of numbers 


The decimal representation of real numbers is usual in all ordinary arithmetic 
work and involves expressing a real number as the sum of an integral part 
and a decimal fraction. Each of the parts is represented as the sum of multiples 
of powers of 10, with the powers being positive integers or zero when repre- 
senting the integral part and negative integers when representing the decimal 
part. The number 10 that forms the basis of the decimal system is called the 
base of the number system. 
The integral part r of a finite real number $\alpha$ is thus expressible as 

$$r = a_n(10^n) + a_{n-1}(10^{n-1}) + \cdots + a_1(10^1) + a_0(10^0),$$

where n is suitably chosen, and the coefficients $a_i$ are either zero or an 
integer between 1 and 9. Hence, in reality, the number 2049 is a convenient 
representation of $2(10^3) + 0(10^2) + 4(10^1) + 9(10^0)$, with the positions 
of the digits indicating the non-negative powers of 10 by which they are to be 
multiplied before addition. 

Similarly, if the decimal fraction part d of a real number $\alpha$ terminates 
after n decimal places, then it is expressible in the form 

$$d = \frac{b_1}{10} + \frac{b_2}{10^2} + \cdots + \frac{b_n}{10^n},$$

with the coefficients $b_i$ again being either zero or an integer between 1 
and 9. Hence the decimal number 0.3012 is, in reality, the representation of 

$$\frac{3}{10} + \frac{0}{10^2} + \frac{1}{10^3} + \frac{2}{10^4},$$

with the positions of the digits indicating the negative powers of 10 by 
which they are to be multiplied before addition. 

In general then, the decimal number that is written 

$$\underbrace{a_m a_{m-1} \ldots a_1 a_0}_{(m+1)\ \text{digits}}\;.\;\underbrace{b_1 b_2 \ldots b_n}_{n\ \text{digits}}$$

and which terminates after n decimal places, is the representation of 

$$a_m(10^m) + a_{m-1}(10^{m-1}) + \cdots + a_1(10^1) + a_0(10^0) + \frac{b_1}{10} + \frac{b_2}{10^2} + \cdots + \frac{b_n}{10^n}.$$

Consideration of the representation of non-terminating decimal fractions 
and irrational numbers will be postponed until we discuss sequences and 
limits, since the approximation of real numbers by rationals has not yet been 
discussed. 
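
The positional rule for a terminating decimal can be expressed as a short computation. The Python sketch below is illustrative only; the function `positional_value` and its argument names are our own, and the `base` argument is an assumption added for generality rather than something used in the text at this point.

```python
from fractions import Fraction

def positional_value(integral_digits, fraction_digits, base=10):
    """Sum a_m*base^m + ... + a_0*base^0 + b_1/base + ... + b_n/base^n."""
    m = len(integral_digits) - 1
    value = Fraction(0)
    # Integral part: the leftmost digit multiplies the highest power.
    for i, digit in enumerate(integral_digits):
        value += digit * Fraction(base) ** (m - i)
    # Fractional part: the j-th digit after the point multiplies base^(-j).
    for j, digit in enumerate(fraction_digits, start=1):
        value += Fraction(digit, base ** j)
    return value

print(positional_value([2, 0, 4, 9], []))      # 2049
print(positional_value([0], [3, 0, 1, 2]))     # 753/2500, that is 0.3012
```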

There is no reason why the base of the number system should not be any 
integer N > 1 and, indeed, in digital computing extensive use is made of the 
binary system. This is the system of representation using the base 2. Hence a 
binary number will contain only the digits 1 and 0 with their position 
indicating the power of 2 involved. Thus we may write 

$$11 = 1(2^3) + 0(2^2) + 1(2^1) + 1(2^0)$$

so that the binary representation of 11 is 1011. Similarly, the rational number 
9/16 may be written 

$$\frac{9}{16} = \frac{1}{2} + \frac{0}{2^2} + \frac{0}{2^3} + \frac{1}{2^4},$$

showing that its binary fraction form is 0.1001. Hence the binary form of the 
number $11\tfrac{9}{16}$ becomes 1011.1001 and, as in the case of decimals, the position of 
a digit relative to the binary point indicates the power of two by which it is 
to be multiplied before addition. 
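
The conversion to binary form can be carried out mechanically: the integral part is converted in the usual way, and the fractional part is repeatedly doubled, the integral part of each product giving the next binary digit. The Python sketch below illustrates this under our own naming (`to_binary`, `max_fraction_digits`); it is not a procedure given in the text.

```python
from fractions import Fraction

def to_binary(value, max_fraction_digits=16):
    """Binary string of a non-negative rational number; the fractional part
    is truncated after max_fraction_digits digits if it does not terminate."""
    value = Fraction(value)
    whole, frac = divmod(value, 1)
    digits = bin(int(whole))[2:]
    frac_digits = []
    while frac and len(frac_digits) < max_fraction_digits:
        frac *= 2                       # shift one binary place to the left
        bit, frac = divmod(frac, 1)     # the integral part is the next digit
        frac_digits.append(str(int(bit)))
    return digits + ("." + "".join(frac_digits) if frac_digits else "")

print(to_binary(11))                      # 1011
print(to_binary(Fraction(9, 16)))         # 0.1001
print(to_binary(11 + Fraction(9, 16)))    # 1011.1001
```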

It is easily verified that the addition and multiplication tables for binary 
numbers are as illustrated in the following two tables: 


    Binary addition          Binary multiplication 

      +  |  0   1               x  |  0   1 
     ----+--------             ----+-------- 
      0  |  0   1               0  |  0   0 
      1  |  1   0               1  |  0   1 

Both tables are entered by selecting one digit in the first column and one 
in the first row, when the result of the operation appropriate to the table, 
namely addition or multiplication, is shown in the body of the table. For 
example, using the addition table and taking the digit 1 in the first column 
and 0 in the first row we see that 1 + 0 = 1. Similarly, taking the digit 
1 in the first column and 1 in the first row we see that 1 + 1 = 0. The 
interpretation of this latter result is, of course, that a digit 1 must be 
transferred to the next higher power of 2, corresponding to the transference 
of multiples of powers of 10 when performing ordinary addition. The 
multiplication table is straightforward and needs no further comment. 

The examples that now follow illustrate the addition, multiplication, and 
subtraction of simple binary numbers. We shall let a = 12, b = 11 and form 
a + b, ab, and a - b using binary notation. The binary representations of a 
and b are a = 1100, b = 1011 and so we have: 


Addition 

      1100 + 
      1011 
     ----- 
     10111 
     1        (carry) 


Multiplication 

        1100 x 
        1011 
    -------- 
     1100000 
       11000 
        1100 
    -------- 
    10000100 


Here the row marked (carry) indicates the transference of a digit 1 
corresponding to the result 1 + 1 = 0. 

The subtraction a - b is equally straightforward provided it is recalled 
that when the subtraction of digits 0 - 1 is encountered, it is necessary to 
'borrow' a digit 1 from the next higher position in the number b. Thus the 
result would be to write 1 in place of 0 - 1 and to add 1 to the next higher 
position in b. 


Subtraction 

      1100 - 
      1011 
      ---- 
      0001 


The expressions a + b, ab, and a - b for a = 12, b = 11 are thus 

$$a + b = 1(2^4) + 0(2^3) + 1(2^2) + 1(2^1) + 1(2^0) = 23,$$
$$ab = 1(2^7) + 0(2^6) + 0(2^5) + 0(2^4) + 0(2^3) + 1(2^2) + 0(2^1) + 0(2^0) = 132,$$
$$a - b = 0(2^3) + 0(2^2) + 0(2^1) + 1(2^0) = 1.$$
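
The digit-by-digit addition with transference of carries can be sketched in Python as follows. The table `ADD` and the function `add_binary` are our own illustrative names; the product and difference are checked here with Python's built-in binary conversion rather than by a hand-written routine.

```python
# Single-digit addition table: (digit, digit) -> (sum digit, carry).
ADD = {(0, 0): (0, 0), (0, 1): (1, 0), (1, 0): (1, 0), (1, 1): (0, 1)}

def add_binary(x, y):
    """Add two binary strings digit by digit, transferring carries."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    carry, out = 0, []
    for dx, dy in zip(reversed(x), reversed(y)):
        digit, c1 = ADD[(int(dx), int(dy))]
        digit, c2 = ADD[(digit, carry)]
        out.append(str(digit))
        carry = c1 | c2          # at most one of c1, c2 can equal 1
    if carry:
        out.append("1")
    return "".join(reversed(out))

print(add_binary("1100", "1011"))            # 10111
print(bin(0b1100 * 0b1011)[2:])              # 10000100
print(bin(0b1100 - 0b1011)[2:].zfill(4))     # 0001
```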


1-6 Mathematical induction 


Mathematical propositions often involve some fixed integer n, say, in a special 
role and it is desirable to infer the form taken by the proposition for arbitrary 
integral n from the form taken by it for the specific value $n = n_1$. The 
logical method by which the proof of the general proposition, if true, may be 
established, is based on the properties of natural numbers and is called 
mathematical induction. 

In brief, it depends for its success on the obvious fact that if A is some set 
of natural numbers and $1 \in A$, then the statement that whenever integer 
$n \in A$, so also does its successor, implies that A = N, the set of natural 
numbers. 

The formal statement of the process of mathematical induction is expressed 
by the following theorem where, for simplicity, the mathematical proposition 
corresponding to integer n is denoted by S(n). 


THEOREM 1-5 (mathematical induction) If it can be shown that, 


(a) when $n = n_1$, the proposition $S(n_1)$ is true, 
and 
(b) if for $n \ge n_1$, when S(n) is true then so also is S(n + 1), 
then the proposition S(n) is true for all natural numbers $n \ge n_1$. 

A simple illustrative example will help here and we now prove inductively 
that the sum $\sum_{r=1}^{n} r$ of the first n natural numbers is given by 
n(1 + n)/2. In other words, in this example the proposition denoted by S(n) is 
that the following result is true: 

$$1 + 2 + \cdots + n = n(1 + n)/2.$$


Proof, step (a) First the proposition must be shown to be true for some 
specific value $n = n_1$. Any integral value $n_1$ will suffice but if we set 
$n_1 = 1$ the proposition corresponding to S(1) is immediately obvious. If, 
instead, we had chosen $n_1 = 3$, then it is easily verified that proposition 
S(3) is true, namely that 1 + 2 + 3 = 3(1 + 3)/2. 


Proof, step (b) We must now assume that proposition S(n) is true and 
attempt to show that this implies that the proposition S(n + 1) is true. If 
S(n) is true then 

$$1 + 2 + \cdots + n = n(1 + n)/2$$

and, adding (n + 1) to both sides, we obtain 

$$1 + 2 + \cdots + n + (n + 1) = n(1 + n)/2 + (n + 1) = (n + 1)(2 + n)/2.$$

However, this is simply a statement of proposition S(n + 1) obtained by 
replacing n by n + 1 in proposition S(n). Hence S(1) is true and 
$S(n) \Rightarrow S(n + 1)$ so, by the conditions of Theorem 1-5, we have 
established that S(n) is valid for all n. 
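
The two steps of the inductive argument can be mirrored by a numerical check. The Python sketch below uses our own function name `sum_formula`; testing finitely many values of n is of course no substitute for the proof, but it shows what each step asserts.

```python
def sum_formula(n):
    """Closed form n(1 + n)/2 for the sum 1 + 2 + ... + n."""
    return n * (1 + n) // 2

# Step (a): the proposition S(1).
assert sum_formula(1) == 1

# Step (b): adding (n + 1) to the formula for n must give the formula
# for n + 1; the direct sum is compared as well.
for n in range(1, 1000):
    assert sum_formula(n) + (n + 1) == sum_formula(n + 1)
    assert sum_formula(n) == sum(range(1, n + 1))
print("checked n = 1, ..., 999")
```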


Later we shall use this form of proof in cases less trivial than the above 
example which simply involved establishing the sum of an arithmetic 
progression. 

As another illustration of an inductive argument we now consider the 
determination of the nth term in the sequence of numbers 
$u_0, u_1, u_2, \ldots$ defined sequentially by the equation 

$$u_n = 2u_{n-1} + 1. \tag{1-17}$$


Equations of this form which define a sequence of discrete numbers $u_n$ 
are called first-order difference equations. It is clear that this difference 
equation provides us with the algebraic rule by which the nth term of the 
sequence may be computed once the first term $u_0$ has been specified. Generally 
speaking, any rule which specifies the form of computation to be pursued in 
order to arrive at the solution of a given problem is called an algorithm. 

A few moments' experiment will suffice to convince the reader that the 
solution to Eqn (1-17) may be expressed in terms of $u_0$ by the equation 

$$u_n = 2^n u_0 + (2^n - 1). \tag{1-18}$$

The initial term $u_0$ of the sequence is arbitrary and on account of this fact 
such a solution is called a general solution of the first-order difference 
equation (1-17). Once $u_0$ is specified by requiring that $u_0 = C$, say, then 
the solution is said to be a particular solution. 

The proof of Eqn (1-18) by induction again proceeds in two parts, with 
the proposition S(n) being that Eqn (1-18) is the solution of Eqn (1-17). 


Proof, step (a) If n = 1, then $u_1 = 2u_0 + (2 - 1) = 2u_0 + 1$, showing that 
the proposition S(1) is true. 


Proof, step (b) Assuming the proposition S(n) is true, then, by Eqn (1-17), 

$$u_{n+1} = 2u_n + 1 = 2[2^n u_0 + (2^n - 1)] + 1 = 2^{n+1} u_0 + (2^{n+1} - 1),$$

showing that $S(n) \Rightarrow S(n + 1)$. The result is thus true for all n. 
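
The agreement between the algorithm and its general solution, $u_n = 2^n u_0 + (2^n - 1)$, can be confirmed by direct computation. In the Python sketch below the names `iterate` and `closed_form` are ours, and the starting values are arbitrary.

```python
def iterate(u0, n):
    """Apply the rule u_k = 2*u_(k-1) + 1 n times, starting from u_0."""
    u = u0
    for _ in range(n):
        u = 2 * u + 1
    return u

def closed_form(u0, n):
    """General solution u_n = 2^n u_0 + (2^n - 1)."""
    return 2 ** n * u0 + (2 ** n - 1)

# Arbitrary initial terms; each choice of u_0 gives a particular solution.
for u0 in (-3, 0, 1, 5):
    for n in range(20):
        assert iterate(u0, n) == closed_form(u0, n)
print([iterate(1, n) for n in range(6)])   # [1, 3, 7, 15, 31, 63]
```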

To conclude this section, having introduced the notion of a difference 
equation let us take the concept a little further so that it can be used in more 
general circumstances. A homogeneous linear difference equation of order 2 
is a relationship of the form 


$$u_n + au_{n-1} + bu_{n-2} = 0, \tag{1-19}$$


where a and b are real constants and $u_{n-2}$, $u_{n-1}$, $u_n$ are three 
consecutive members of a sequence of numbers. Given any two consecutive members 
in the sequence, say $u_0$ and $u_1$, then Eqn (1-19) provides an algorithm by 
which any other member of the sequence may be computed. 

If we seek a solution $u_n$ of the form 

$$u_n = A\lambda^n, \tag{1-20}$$

where A and $\lambda$ are real constants, then substitution into Eqn (1-19), 
followed by division by $A\lambda^{n-2}$, shows that 

$$\lambda^2 + a\lambda + b = 0. \tag{1-21}$$

This is called the characteristic equation associated with the difference 
equation (1-19) and shows that solutions of the form of Eqn (1-20) are only 
possible when $\lambda$ is equal to one of the two roots $\lambda_1$ and 
$\lambda_2$ of Eqn (1-21), which we assume to be real numbers. If 
$\lambda_1 \ne \lambda_2$, then $A\lambda_1^n$ and $B\lambda_2^n$ are both 
solutions of Eqn (1-19) and it is easy to show that 

$$u_n = A\lambda_1^n + B\lambda_2^n \tag{1-22}$$

is also a solution, where A and B are arbitrary real constants. This result is 
the general solution of Eqn (1-19). Given specific values for $u_0$ and $u_1$, 
A and B can be deduced by substituting into Eqn (1-22) and hence a particular 
solution found. 
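
The procedure just described (find the two roots, then fix A and B from $u_0$ and $u_1$) can be sketched in Python as below. The function `solve_second_order` and the numerical example are our own assumptions, chosen so that the roots are the integers 3 and 2; the sketch deliberately covers only the case of distinct real roots.

```python
import math

def solve_second_order(a, b, u0, u1):
    """Return a function n -> u_n for u_n + a*u_(n-1) + b*u_(n-2) = 0,
    assuming the characteristic equation has two distinct real roots."""
    disc = a * a - 4 * b
    if disc <= 0:
        raise ValueError("this sketch handles distinct real roots only")
    lam1 = (-a + math.sqrt(disc)) / 2
    lam2 = (-a - math.sqrt(disc)) / 2
    # Solve A + B = u0 and A*lam1 + B*lam2 = u1 for A and B.
    A = (u1 - lam2 * u0) / (lam1 - lam2)
    B = u0 - A
    return lambda n: A * lam1 ** n + B * lam2 ** n

# Hypothetical example: u_n - 5u_(n-1) + 6u_(n-2) = 0 with u_0 = 2, u_1 = 5.
# The roots are 3 and 2 and A = B = 1, so that u_n = 3^n + 2^n.
u = solve_second_order(-5, 6, 2, 5)
print([round(u(n)) for n in range(6)])   # [2, 5, 13, 35, 97, 275]
```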


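Suppose, for example, that the difference equation was 

$$u_n - u_{n-1} - u_{n-2} = 0,$$

and that $u_0 = u_1 = 1$. Then the characteristic equation is 

$$\lambda^2 - \lambda - 1 = 0,$$

with the two roots $\lambda_1 = (1 + \sqrt{5})/2$ and 
$\lambda_2 = (1 - \sqrt{5})/2$. Hence the general solution has the form 

$$u_n = A\left(\frac{1 + \sqrt{5}}{2}\right)^n + B\left(\frac{1 - \sqrt{5}}{2}\right)^n. \tag{1-23}$$

To deduce the values of A and B particular to our problem we use the 
initial conditions $u_0 = 1$ and $u_1 = 1$ to deduce from Eqn (1-23) that 

$$1 = A + B \quad (\text{case } n = 0,\ u_0 = 1),$$
$$1 = A\left(\frac{1 + \sqrt{5}}{2}\right) + B\left(\frac{1 - \sqrt{5}}{2}\right) \quad (\text{case } n = 1,\ u_1 = 1).$$

Solving these equations for A and B we find 

$$A = \frac{\sqrt{5} + 1}{2\sqrt{5}}, \qquad B = \frac{\sqrt{5} - 1}{2\sqrt{5}},$$

whence the particular solution is 

$$u_n = \left(\frac{\sqrt{5} + 1}{2\sqrt{5}}\right)\left(\frac{1 + \sqrt{5}}{2}\right)^n + \left(\frac{\sqrt{5} - 1}{2\sqrt{5}}\right)\left(\frac{1 - \sqrt{5}}{2}\right)^n.$$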
The first few numbers $u_0, u_1, u_2, \ldots$ of the sequence generated by this 
algorithm are 

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ... 

and comprise the well-known Fibonacci sequence of numbers. This sequence 
of numbers occurs naturally in the study of regular solids and in numerous 
other parts of mathematics. Of course, if only the first few members of the 
sequence are required then they are most easily found by use of the algorithm 
itself, which in the form 

$$u_n = u_{n-1} + u_{n-2}$$

states that each member of the sequence is the sum of its two predecessors. 
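
Both routes to the Fibonacci numbers can be compared numerically. In the Python sketch below (function names ours) the closed form uses the constants $A = (\sqrt{5} + 1)/(2\sqrt{5})$ and $B = (\sqrt{5} - 1)/(2\sqrt{5})$ obtained from $u_0 = u_1 = 1$; since it is evaluated in floating-point arithmetic, the results are rounded to the nearest integer.

```python
import math

def fib_recurrence(count):
    """First `count` terms of u_n = u_(n-1) + u_(n-2) with u_0 = u_1 = 1."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

def fib_closed_form(n):
    """Particular solution A*lam1^n + B*lam2^n for u_0 = u_1 = 1."""
    s = math.sqrt(5)
    A = (s + 1) / (2 * s)
    B = (s - 1) / (2 * s)
    return A * ((1 + s) / 2) ** n + B * ((1 - s) / 2) ** n

print(fib_recurrence(10))                               # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print([round(fib_closed_form(n)) for n in range(10)])   # the same list
```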

It is not difficult to see that if the roots of the characteristic equation 
(1-21) are equal so that $\lambda_1 = \lambda_2 = \mu$, say, then $A\mu^n$ is a 
solution of Eqn (1-19). In terms of Eqn (1-19) this is equivalent to saying 
that $a^2 = 4b$ and $\mu = -a/2$. However $A\mu^n$ cannot be the general 
solution since it only involves one arbitrary constant A, and it is necessary 
to have two such constants in the general solution to allow the specification 
of the initial conditions $u_0$ and $u_1$. The difficulty is easily resolved 
once we notice that $nB\mu^n$, with B an arbitrary real constant, is also a 
solution of Eqn (1-19). This is easily verified by direct substitution. We 
then have for the general solution in the case of equal roots in the 
characteristic equation, 

$$u_n = (A + nB)\mu^n. \tag{1-24}$$


To illustrate this situation, suppose that we are required to solve the 
difference equation 

$$u_n = 6u_{n-1} - 9u_{n-2}$$

subject to the initial conditions $u_0 = 1$, $u_1 = 2$. Then the characteristic 
equation becomes 

$$\lambda^2 - 6\lambda + 9 = 0,$$

with the double root $\lambda = 3$. From Eqn (1-24) the general solution must 
thus be 

$$u_n = (A + nB)3^n.$$

Using the initial conditions $u_0 = 1$, $u_1 = 2$, then, shows that 

$$1 = A \quad \text{and} \quad 2 = 3(A + B),$$

so that $B = -\tfrac{1}{3}$ and the particular solution to the problem in 
question is 

$$u_n = (1 - \tfrac{1}{3}n)3^n.$$
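
The equal-root solution can be checked against the algorithm itself. The Python sketch below (function names ours) compares the recurrence $u_n = 6u_{n-1} - 9u_{n-2}$, started from $u_0 = 1$, $u_1 = 2$, with the formula $u_n = (1 - \tfrac{1}{3}n)3^n$, using exact fractions.

```python
from fractions import Fraction

def by_recurrence(n):
    """u_n from u_n = 6u_(n-1) - 9u_(n-2) with u_0 = 1, u_1 = 2."""
    prev, cur = 1, 2
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 6 * cur - 9 * prev
    return cur

def by_formula(n):
    """u_n = (A + nB) * 3^n with A = 1 and B = -1/3."""
    return (1 - Fraction(n, 3)) * 3 ** n

for n in range(15):
    assert by_recurrence(n) == by_formula(n)
print([by_recurrence(n) for n in range(6)])   # [1, 2, 3, 0, -27, -162]
```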


PROBLEMS 


Section 1-1 
1-1 Enumerate the elements in the following sets in which I signifies the set of 
natural positive and negative integers including zero: 
(a) $S = \{n \mid n \in I,\ 5 < n^2 < 47\}$; 
(b) $S = \{n^3 \mid n \in N,\ 15 < n^2 < 40\}$; 
(c) $S = \{(m, n) \mid m, n \in I,\ 12 < m^2 + n^2 < 18\}$; 
(d) $S = \{(m, n, m + n) \mid m, n \in N,\ 45 < m^2 + n^2,\ 3 < m + n < 9\}$; 
(e) $S = \{x \mid x \in N,\ x^2 + 10x - 11 = 0\}$. 

1-2 Express the following sets in the notation of the previous question: 


(a) the set of positive integers whose cubes lie between 7 and 126; 
(b) the set of integers which are the squares of the integers lying between M 
and N ($0 < N < M$); 
(c) the points in the plane that lie between circles of radii 1 and 3 drawn about 
the origin and which have x-coordinates greater than 0.5. 
1-3 Give an example of 


(a) a finite set having numerical elements, 
(b) a finite set having non-numerical elements, 


and in each case give an example of a proper subset. 


1-4 Give an example of 


(a) a set of ordered triples involving numerical quantities, 
(b) a set of ordered triples involving non-numerical quantities, 


and in each case give an example of a proper subset. 


1-5 State the relationships between the sets A and B if: 
(a) $A = N$, $B = \{2n \mid n \in N\}$; 
(b) $A = \{\sin x \mid x = (1 + 12n)\pi/6,\ n \in N\}$, $B = \{\tfrac{1}{2}\}$; 
(c) $A = \{1, 2, 3, 4\}$, $B = \{5, 7, 9, 11\}$. 


1-6 Form the union, intersection, and the complement of B relative to A of the 
sets A and B if: 
(a) $A = N$, $B = \{2n \mid n \in N\}$; 
(b) $A = \{a, b, c, 0, 2, 4\}$, $B = \{d, e, f, 1, 3, 6, 7\}$; 
(c) $A = \{1, \sqrt{2}, 2, 3, 5, 6\}$, $B = \{0, \sqrt{2}, \sqrt{5}\}$. 


1-7 Construct Venn diagrams for the union and intersection of the sets A and B if: 


(a) A is the set of points interior to the unit square (that is, square having 
side of unit length) with one corner at the origin and lying entirely in the 
first quadrant, and B is the set of points exterior to the unit circle centred 
on the origin; 

(b) A is the set of points interior to the isosceles triangle of unit side with its 
centre of gravity at the origin and a side parallel to the x-axis, and B is 
the unit square having its centre at the origin and a side parallel to the 
x-axis. 

1-8 Represent by points on a graph the 36 possible outcomes of throwing two 
dice, each with faces numbered 1 to 6. Identify the set of points at which the 
sum of the scores on the two dice is greater than or equal to 7. 


1-9 By using Venn diagrams, prove Eqns (1-6) and (1-7) for sets which may be 
represented by points in the plane. 


1-10 Complete the details of the analytical proof of the first stated result of 
Theorem 1-1. 


1-11 Illustrate by means of a Venn diagram the result 
$A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$ of Theorem 1-1. 


1-12 The expression $(A \setminus B) \cup (B \setminus A)$ is called the 
symmetric difference of sets A and B. Illustrate the result by means of a Venn 
diagram and show that 
$(A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B)$. 


1-13 Prove analytically that $A \cap B = A \setminus (A \setminus B)$ and 
illustrate your result by means of a Venn diagram. 


1-14 In the following expressions, replace the symbol * by $\Leftarrow$, by 
$\Rightarrow$ or by $\Leftrightarrow$ to make them valid logical statements 
concerning the sets A, B and an element x: 
(a) $x \in A$ * $x \in A \cup B$; 
(b) $x \in B$ * $x \in A \cup B$; 
(c) $x \in A$ * $x \in A \cap B$; 
(d) $x \in A$ or $x \in B$ or $x \in A \cap B$ * $x \in A \cup B$; 
(e) $x \in A$ or $x \in B$, $x \notin A \cap B$ * $x \in (A \cup B) \setminus (A \cap B)$. 
Give one example each of the use of $\Rightarrow$ and $\Leftarrow$. 


1-15 If * is a set operation and it is true that (A * B) * C = A * (B * C), then 
the operation * is said to be associative. Use a Venn diagram to prove that 


(a) $(A \cup B) \cup C = A \cup (B \cup C) = A \cup B \cup C$; 
(b) $(A \cap B) \cap C = A \cap (B \cap C) = A \cap B \cap C$. 


Section 1:2 
1:16 Toss a coin 50 times and plot the relative frequency of ‘heads’. 


1:17 Suggest a graphical representation for the sample space in which the outcome 
of tossing three coins might be recorded. 


1-18 Suggest a graphical representation for the sample space characterizing the 
score recorded in a trial involving the tossing of a die together with a coin 
which has faces numbered 1 and 2. Give examples of: 


(a) two disjoint subsets of the sample space; 
(b) two intersecting subsets of the sample space, indicating the points in their 
intersection. 


1-19 By using Eqn (1-9) explain why 
$$P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A).$$
Verify your result by computing P(A), P(B), P(A | B), P(B | A), and 
$P(A \cap B)$ using the sets defined in connection with Fig. 1-6 (b). 


1-20 Use a Venn diagram to prove the generalized probability addition rule 
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C).$$


1-21 Use Theorem 1-2 to prove the generalized probability multiplication rule 
$$P(A \cap B \cap C) = P(A)\,P(B \mid A)\,P(C \mid A \cap B).$$
1-22 Complete the argument in Example 1-2 (a). 


1:23 A bag contains 30 balls of which 5 are red and the remainder are black. A 
trial comprises drawing a ball from the bag at random, recording the result 
and then replacing the ball and shaking the bag. This process is called sampling 
with replacement. If this process is repeated twice, what is the probability of 
selecting 


(a) 2 red balls; 
(b) 2 black balls; 
(c) 1 red and 1 black ball? 


1-24 By considering arrangements of the five letters A, B, C, D, E verify that 
${}^5P_2 = 20$ and $\binom{5}{2} = 10$. 


1-25 How many blends of coffee comprising equal quantities of 4 different types of 
coffee bean are possible if 9 different types of coffee bean are available? 


1-26 A game involves a team of 5 persons who play sequentially. How many 
different teams may be drawn up if 10 players are available? 


1-27 On the assumption that a participant in a raffle will buy either 2 or 4 numbered 
tickets, how many different sets of tickets may he choose from a book of 
20 tickets? 


1-28 A coin is biased so that the probability of 'heads' is 0.52. What is the 
probability that: 
(a) 3 heads will occur in 6 throws; 
(b) 3 or more heads will occur in 6 throws? 


1-29 Shells fired from a gun have a probability 4 of hitting the target. What is the 
probability of missing the target if 4 shells are fired? 


1-30 Draw the probability density function for the binomial distribution in which 
p =x and n= 6. Use your result to draw the corresponding cumulative 
distribution function. 


1-31 By considering Fig. 1-6 (a) deduce and draw the probability density function 
describing the sum of the scores on the two dice. 


Section 1:3 


1-32 Describe two different ways of defining N rational numbers between 1 and 2. 
Generalize one of these methods to interpolate N rationals between any two 
rationals a and ὁ. 


1:33 Working from the array of rational numbers given in Section 1-3, use arrows 
to suggest two alternative schemes to the one already described by which all 
the rational numbers may be enumerated. Is this array the only possible one 
that may be used ? If not, give an alternative. 


1-34 Use the fact that $\sqrt{2}$ is irrational to prove that if $\alpha$ is a 
rational number, then $\alpha + \sqrt{2}$, $\alpha\sqrt{2}$ and 
$\sqrt{2}/\alpha$ are also irrational. Would the results still be true if 
$\sqrt{2}$ were replaced by any other irrational number, and would your proof 
still suffice? 


1-35 Prove that $\sqrt{3}$ is irrational. (Hint: first assume that $\sqrt{3}$ 
is rational and equal to p/q, and then obtain a contradiction by considering 
even and odd values of q separately.) 


1-36 The operation of division is defined in terms of multiplication as indicated in 
the following problem. The reader is required to provide the justification for 
some familiar arithmetic operations using only the operation of multiplication 
and the definition provided. Given that a and b are real numbers and that 
$b \ne 0$, we define a/b by k = a/b if, and only if, kb = a. Does this define 
a/b uniquely? Why is it necessary that $b \ne 0$? Show that a/b = ca/cb 
whenever $c \ne 0$ and that a/b + c/d = (ad + bc)/bd; (a/b)(c/d) = ac/bd; 
1/(a/b) = b/a ($a \ne 0$). 


1-37 Prove the following statements concerning real numbers by directly applying 
the real number axioms: 

(a) There is just one zero element and one unit element; 
(b) $a + x = a + y \Rightarrow x = y$; 
(c) $0 \cdot a = a \cdot 0 = 0$; 
(d) $ax = ay$ and $a \ne 0 \Rightarrow x = y$; 
(e) $(-a)b = a(-b) = -(ab)$; $(-a)(-b) = ab$; 
(f) $ab = 0 \Rightarrow a = 0$ or $b = 0$; 
(g) $a(b - c) = ab - ac$. 


1-38 The expression $\{a_i\}_{i=1}^{n}$ denotes the sequence of numbers 
$a_1, a_2, \ldots, a_n$. Given that $\{a_i\}_{i=1}^{7}$ = 0.2, 3, 1.8, 2.2, 1, 
3, 2 and $\{b_i\}_{i=1}^{7}$ = 0.3, 2, 1.8, 1.1, 2, 4, 1 verify the inequality 
of Example 1.5. 


1-39 Prove that if $a > b > 0$ and $k > 0$ then 
$$\frac{b}{a} < \frac{b + k}{a + k} < 1 < \frac{a + k}{b + k} < \frac{a}{b}.$$


1-40 Indicate by means of a diagram the intervals defined by the following 
expressions, using a dot to signify an end point belonging to an interval and a 
circle to indicate an end point excluded from the interval: 

(a) (x 2) + 3) < ( — D(& — 2); 

(b) $0 < |x - 3| < 1$; 


Cy ee 2: 


(d) $0 < |2x + 1| < 1$; 
(e) $|3x + 1| > 2$; 
(oe x 
Pe ae 2Ἃα = I) 


1-41 Identify the regions in the (x, y)-plane determined by the following 
inequalities. Mark a boundary that belongs to the region by a full line; a 
boundary that does not by a dotted line; an end point that is included in an 
interval by a dot; an end point excluded from an interval by a circle: 

(a) $x^2 + y^2 < 1$; $x < 0$; $y < -x$; 

(b) y<sinx; 2+ γ 5 πὸ; p< 3; 

(c) $x? + y? > 1; 7 

(d) y> xt: |x -1 oer. 


1-42 Give numerical examples to illustrate Theorem 1-4. 


1-43 Prove Theorem 1-4 (b) by considering separately the cases $a \ge 0$, 
$b \ge 0$; $a < 0$, $b < 0$; $a \ge 0$, $b < 0$; $a < 0$, $b \ge 0$. 


1-44 Express these numbers in binary notation: 

(a) 27; (b) lve; (c) 286; (d) ἐξ. 


1-45 Express the following numbers in binary notation, and then use your results 
to form the expressions a + b, a - b, and ab. Check by interpreting the results 
in terms of the base 10: 

(a) a=12,b=11; (b) a= 3;'6,b =1; (c) ae b= Τὰ 


1-46 Give numerical examples to illustrate Theorem 1-4 using binary notation. 


1-47 Using the number system to base 3 and the digits 0, 1, 2 represent these 
numbers: 

(a) 27; (b) 28; (c) xy; (d) ἐξ. 


1-48 Using the number system to base 3 and the digits 0, 1, 2 write out the addition 
and multiplication table for three digits analogous to those of Section 1-5. 


1-49 Express the following numbers in terms of the base 3 and use the tables of the 
previous problem to evaluate a + b, a - b, and ab. Check by re-interpreting 
your results in terms of the base 10. 


(a)a=4,b=24; (b)a=3,b=5; (Cc) a= ὅ, ὁ = 94. 
1-50 Give an inductive proof that 

(a) $\sum_{r=0}^{n-1} (a + rd) = \frac{n}{2}[2a + (n - 1)d]$;  (Arithmetic Progression) 

(b) $\sum_{r=1}^{n} r^2 = \frac{n(n + 1)(2n + 1)}{6}$.  (Sum of Squares) 


1-51 The expansion 

$$(a + b)^n = a^n + na^{n-1}b + \frac{n(n - 1)}{2!}a^{n-2}b^2 + \cdots + nab^{n-1} + b^n,$$

and the equivalent result 

$$(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^r b^{n-r},$$

are called the binomial expansion. Prove the result inductively for the case 
when n is a natural number. 


1-52 Give an inductive proof of the results 

(a) $\sum_{s=0}^{n-1} r^s = \frac{1 - r^n}{1 - r}$;  (Geometric Progression) 

(b) $\sum_{r=1}^{n} r^3 = \frac{n^2(n + 1)^2}{4}$.  (Sum of Cubes) 


1-53 Find the general solution to the difference equation 

$$u_n + u_{n-1} - 6u_{n-2} = 0.$$


ben 
“ 
™~ 
-ς 
i] 
— 


Determine the particular solution corresponding to m1 
1-54 Find the general solution to the difference equation 
$$u_n - 3u_{n-1} + 2u_{n-2} = 0.$$
Determine the particular solution corresponding to $u_1 = 3$, $u_2 = 7$. 

1-55 Find the solution to the difference equation 
$$u_n - 2u_{n-1} + u_{n-2} = 0$$
given that $u_1 = 2$, $u_2 = 3$. 

1-56 Find the general solution to the difference equation 

$$u_n - 6u_{n-1} + 9u_{n-2} = 0.$$


II 
— 
= 
NM 

Ι! 

| 
wa 


Determine the particular solution corresponding to mu 


Variables, functions, and 
mappings 


2-1 Variables and functions 


In the physical world the idea of one quantity depending on another is very 
familiar, a typical example being provided by the observed fact that the 
pressure of a fixed volume of gas depends on its temperature. This situation 
is reflected in mathematics by the notion of a function, which we shall now 
discuss in some detail. 

The modern definition of a function in the context of real numbers is 
that it is a relationship, usually a formula, by which a correspondence is 
established between two sets A and B of real numbers in such a manner 
that to each number in set A there corresponds only one number in set B. 
The set A of numbers is the domain of the function and the set B of numbers 
is the range of the function. 

If the function or rule by which the correspondence between numbers in 
sets A and B is established is denoted by f, and x denotes a typical number in 
the domain A of f, then the number in the range B to be associated with x 
by the function f is written f(x) and is read 'f of x'. The numbers x and f(x) 
are variables with x being given the specific name independent variable and 
f(x) the name dependent variable. The independent variable is also often 
called the argument of the function f. 

It is often helpful to construct the graph of f which mathematically is 
the set of ordered number pairs (x, f(x)), where x belongs to the domain of f. 
Geometrically the graph of f is usually represented by a plane curve, drawn 
relative to an origin defined by the intersection of two perpendicular straight 
lines called axes. The process of construction is as follows. A distance 
proportional to x is measured along one axis and a distance proportional to 
f(x) along the other axis. Through each resulting point on an axis is then 
drawn a line parallel to the other axis and these two perpendicular lines 
intersect at a unique point in the plane of the axes. This point of 
intersection is the point (x, f(x)) and the graph of f is defined to be the 
locus or curve formed by joining up all such points corresponding to the 
domain of f, as illustrated by Fig. 2-1. 
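
The description of a graph as a set of ordered pairs (x, f(x)) translates directly into a computation. The Python sketch below is purely illustrative: the helper `graph`, the choice $f(x) = x^2$ and the five sample points are our own assumptions, and only a finite sample of the domain can be tabulated in this way.

```python
def graph(f, domain):
    """Return the ordered pairs (x, f(x)) for each x in a finite sample of the domain."""
    return [(x, f(x)) for x in domain]

# Hypothetical example: f(x) = x^2 sampled at five points of its domain.
points = graph(lambda x: x * x, [-2, -1, 0, 1, 2])
print(points)   # [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]

# Each x occurs in exactly one pair, since f assigns only one number to each x;
# different x may nevertheless share the same value f(x).
assert len({x for x, _ in points}) == len(points)
```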

However, it is not necessary to use axes of this type, called rectangular 
Cartesian axes, and any other geometrical representation which gives unique 
representation of the points (x, f(x)) would serve equally well. Thus the 
axes could be inclined at an angle $\alpha \ne \tfrac{1}{2}\pi$ and the scale 
of measurement along them need not be uniform. For example, it is often useful 
to plot the logarithm of x along the x-axis, rather than x itself. This 
compresses the x scale so that large values of x may be conveniently displayed 
on the graph together with small values. Another possible representation 
involves the use of curved reference axes and leads to curvilinear 
coordinates. This will be taken up again later in connection with conformal 
mapping. 

Not every function can be represented in the form of an unbroken curve, 
and the function 


\[
f(x) = \begin{cases} 0 & \text{when } x \text{ is rational,} \\ 1 & \text{when } x \text{ is irrational,} \end{cases}
\]


provides an extreme example of this situation. Here, although the graph 
would look like a line parallel to the x-axis on which all points have the 
value unity, in reality the infinity of points with rational x-coordinates would 
be missing since they lie on the x-axis itself. The domain is all the real numbers 
R and the range is just the two numbers zero and unity. 

Because f transforms one set of real numbers into another set of real 
numbers a function is sometimes spoken of as a transformation between 
sets of real numbers. On account of the restriction to real numbers or, more 
explicitly, to real variables, the function f(x) is called a function of one real 
variable. Another name that is often used for a function is a mapping of some 
set of real numbers into some other set of real numbers. This name is of course 
suggested by the geometrical illustration of the graph of a function and we 
shall return more than once to the notion of a mapping. In this terminology, 
f(x) is referred to as the image of x under the mapping f.

Since the domain and range of f occur as intervals on the x- and y-axes, 
it is convenient to use a simplified notation to identify the form of the interval 
that is involved. We now adopt the almost standard notation summarized 
below in which a round bracket indicates an open end of an interval, and a 
square bracket indicates a closed end of an interval:

(a, b)     ⇔  a < x < b,
[a, b]     ⇔  a ≤ x ≤ b,
(a, b]     ⇔  a < x ≤ b,
[a, b)     ⇔  a ≤ x < b,
(−∞, a]    ⇔  x ≤ a,
[a, ∞)     ⇔  a ≤ x,
(−∞, ∞)    ⇔  all x ∈ R.

As the definition of open and closed intervals is only a matter of considering
the behaviour of the end points, we shall define the length of all the intervals
(a, b), [a, b), (a, b], and [a, b] to be the number b − a. This is consistent with
the obvious result that the length of an ‘interval’ comprising only one point
is zero.


Fig. 2.1 Domain, range, and graph of f(x).


It may happen that when x lies within some interval, as for example the 
interval (b, c] in Fig. 2.1, each point x is associated with a unique image point
f(x) and, conversely, each image point f(x) is associated with a unique point x.
Such a mapping or function f is then said to be one-one in the domain in 
question. 

However, there is another possibility that can arise and that is that in some 
interval of the x-axis, more than one point x may correspond to the same 
image point f(x). This is again well illustrated by Fig. 2.1 if now we consider
the interval [a, b] and the points x₂ and x₃, both of which have the same image
point since f(x₂) = f(x₃). In situations such as these the mapping or function
f is said to be many-one in the domain in question.

A specific example might help here and we choose for f the function
f(x) = x² and the two different domains [0, 3] and [−1, 3]. A glance at
Fig. 2.2 shows that f maps the domain [0, 3] onto the range [0, 9] one-one,
but that it maps the domain [−1, 3] onto the same range [0, 9] many-one.
Expressed another way, the range [0, 1] shown as a solid line in the figure is
mapped twice by points in the domain [−1, 3]; once by points in the
sub-domain −1 ≤ x ≤ 0 and once by points in the sub-domain 0 ≤ x ≤ 1.
Again considering the domain [−1, 3], the function f(x) = x² maps the
sub-domain 1 < x ≤ 3 onto the range (1, 9] one-one.
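The distinction between the two domains can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name classify_mapping and the choice of integer sample points are assumptions made purely for illustration. A finite sample can expose a many-one mapping, but it can only suggest, never prove, that a mapping is one-one.

```python
def classify_mapping(f, points):
    """Classify f on a finite sample of its domain.

    Returns 'many-one' as soon as two distinct sample points share an image,
    otherwise 'one-one' (which holds only for the sample that was tested).
    """
    seen = {}  # image value -> the sample point that produced it
    for x in points:
        y = f(x)
        if y in seen and seen[y] != x:
            return "many-one"
        seen[y] = x
    return "one-one"


def square(x):
    return x * x


sample_a = range(0, 4)   # integer sample of the domain [0, 3]
sample_b = range(-1, 4)  # integer sample of the domain [-1, 3]

print(classify_mapping(square, sample_a))
print(classify_mapping(square, sample_b))
```

On the second sample the points −1 and 1 share the image 1, which is exactly the double covering of the range [0, 1] described above.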

Fig. 2.2 Example of a many-one mapping in the shaded range and a one-one mapping
in the hatched range.

In many older books the term function is used ambiguously in that it is
sometimes applied to relationships which do not comply with our definition
of a function. The most familiar example of this is the ‘function’ y = √x,
which fails to comply with our definition because to every positive x there
correspond two values for y, namely the positive and negative square roots of
x which are equal in magnitude but opposite in sign. A mapping of this kind
is one-many in the sense that to one value of x there corresponds more than
one image point f(x), and although it is permissible to describe this
relationship as a mapping, it is incorrect to term it a function.

Nevertheless, the square root operation is fundamental to mathematics 
and we must find some way to make it and similar ones legitimate. The 
difficulty is easily resolved if we consider how the square root is used in 
applications. In point of fact two different relationships are always
considered which together are equivalent to y = ±√x. These are y₁ = +√x and
y₂ = −√x, where the square root is always to be understood to denote the
positive square root and the sign identifies the relationship being considered.
Each of the mappings y₁(x) and y₂(x) of the domain (0, ∞) is one-one, as
Fig. 2.3 shows, so that they may each be correctly termed a function, the
particular one to be used in any application being determined by other
considerations, such as that the result must be positive or negative. These ideas
will arise again later in connection with inverse functions.

Fig. 2.3 The square root function.


In general, if the domain of a function f is not specified then it is understood
to be the largest interval on the x-axis for which the function is defined. So if
f(x) = x² + 4, then as this is defined for all x, the largest possible domain
must be (−∞, ∞). Alternatively the function f(x) = +√(4 − x²) is only
defined in terms of real numbers when −2 ≤ x ≤ 2, showing that the largest
possible domain is [−2, 2]. Similarly, the function f(x) = 1/(1 − x) is
defined for all x with the sole exception of x = 1 so that the largest possible
domain is the entire x-axis with the single point x = 1 deleted from it.

A function need not necessarily be defined for all real numbers on some 
interval and, as in probability theory, it is quite possible for the dependent 
and independent variables to assume only discrete values. Thus the rule which 
assigns to any positive integer n the number of positive integers whose squares 
are less than n, defines a perfectly good function. Denoting this function by 
f we have for its first few values f(1) = 0, f(2) = 1, f(3) = 1, f(4) = 1,
f(5) = 2, f(6) = 2, f(7) = 2, f(8) = 2, f(9) = 2, f(10) = 3, . . . . Clearly,
both its domain and its range are the set N of natural numbers and the 
mapping is obviously many-one. 
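The values listed for this discrete function can be reproduced directly. The sketch below is an illustration only; the name count_squares_below is hypothetical. It relies on the observation that the positive integers k with k² < n are exactly those with k ≤ ⌊√(n − 1)⌋.

```python
import math


def count_squares_below(n):
    """Number of positive integers k whose square is less than n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # k*k < n  is equivalent to  k*k <= n - 1,  i.e.  k <= isqrt(n - 1).
    return math.isqrt(n - 1)


values = [count_squares_below(n) for n in range(1, 11)]
print(values)  # [0, 1, 1, 1, 2, 2, 2, 2, 2, 3]
```

Several different arguments n share the same functional value, which is the many-one behaviour noted in the text.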

Before examining some special functions let us formulate our definition
of a function in rather more general terms. This will be useful later since 
although in the above context the relationships discussed have always been 
between numbers, in future we shall establish relationships between quantities 
that are not simply real numbers. When we do so, it will be valuable if we 
can still utilize the notion of a function. This will occur, for example, when 
we establish correspondence between quantities called vectors which although 
obeying algebraic laws are not themselves real numbers. 

The idea of a relationship between arbitrary quantities is one which we 
have already started to examine in the previous chapter in connection with 
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‘set’ quite naturally when thinking of a 
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DEFINITION 2:1 A function f is a correspondence, often a formula, by 
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x = 2 shown dotted in Fig. 2-4 are called asymptotes to the graph of f and 
coinci 


although the graph approaches arbitrarily close to the asymptotes it never 
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Example 2:2 A discrete valued function may be defined by a table which is 
simply an arrangement of ordered number pairs in a sequence. 


Table 2.1

x       0     1     3     7
f(x)    2.1   4.2   1.0   6.3


Example 2.3 One possible system of curvilinear coordinates in the first 
quadrant may be defined as follows. Using Cartesian coordinates, construct 
the set of curves y = a/x and the set of straight lines y = mx, each with
domain (0, ∞) and with a > 0, m > 0. Representative examples of these
curves are shown in Fig. 2.5 (a) for the stated values of a and m.


Fig. 2.5 (a) Families of curves y = a/x and y = mx; (b) curvilinear coordinates.


In general, any set of curves such as either of these which is derivable 
from the same equation by a suitable choice of constant is called a family of 
curves, and the constant which is fixed for any one curve but which varies 
from curve to curve, is called a parameter. This term parameter will often be 
used in contexts which do not involve families of curves, but in every case it
will be used as here in the sense that it implies a ‘variable constant’. 

Next we disregard the Cartesian axes and the manner of construction of 
the two families of curves and regard the two families of curves themselves 
as defining new coordinate lines as in Fig. 2:5 (b). Each member of the family 
of rectangular hyperbolas will then define a line along which a is constant, 
no two members of the family either intersecting or having the same value of 
a. Similarly, each member of the family of straight lines through the origin O,
collectively called a pencil of lines, is characterized by a different value of m.
Apart from the single point O through which all the straight lines pass,
appropriately called a singular point, there is no ambiguity as to the values of
a and m to be associated with any point in the region of the plane defined
by the two families. We shall use the quantities a and m as our new coordinates
for a point. Graphs may now be constructed using the two families of curves
as curvilinear coordinates. The intersection of a hyperbola and straight line
will define a point in the plane with coordinates given by the ordered number
pair (a, m). Thus the points A, B, and C in Fig. 2.5 (b) have curvilinear
coordinates (4, 1), (1, 4), and (2, 4), respectively.
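The passage between Cartesian coordinates and these curvilinear coordinates can be sketched in a few lines. The function names below are hypothetical and the formulas are derived here rather than quoted from the text: a point (x, y) of the first quadrant lies on the hyperbola y = a/x with a = xy and on the line y = mx with m = y/x, and conversely x = √(a/m), y = √(am).

```python
import math


def to_curvilinear(x, y):
    """Cartesian (x, y) in the open first quadrant -> curvilinear (a, m)."""
    if x <= 0 or y <= 0:
        raise ValueError("the point must lie in the open first quadrant")
    return x * y, y / x


def to_cartesian(a, m):
    """Curvilinear (a, m) with a > 0, m > 0 -> Cartesian (x, y)."""
    if a <= 0 or m <= 0:
        raise ValueError("a and m must both be positive")
    return math.sqrt(a / m), math.sqrt(a * m)


a, m = to_curvilinear(2.0, 2.0)
print(a, m)
print(to_cartesian(a, m))
```

Because each admissible pair (a, m) corresponds to exactly one point, the conversion is one-one everywhere except at the singular point O, which is excluded by the checks above.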

Naturally the graph of the function y = x² with domain (0, ∞) would
look different when plotted first in Cartesian coordinates and then in these
curvilinear coordinates by setting a ≡ x and m ≡ y. They would however be
two different geometrical representations of the same function. Here we have
made use of the useful symbol ≡, which is read ‘identically equal to’.


Example 2.4 This example is a final illustration of our more general
definition of a function. Take as the domain of the function f the set A of all
people, and as the range B of the function f the set of all towns in the world.
Then for the function f we propose the rule that assigns to every person his
place of birth.

Clearly this function defines a many-one mapping of set A onto set B,
since although a person can only be born in one place, many other people 
may have the same place of birth. This example also serves to distinguish 
clearly between the concept of a ‘function’ which is the rule of assignment, 
and the concept of the ‘variables’ associated with the function which here are 
people and places. 


2.2 Inverse functions 


In the previous section we remarked that a typical example of a correspon- 
dence between physical quantities was the observed fact that the pressure of a 
fixed volume of gas depends on its temperature. Expressed in this form we are 
implying that the dependent variable is the pressure p and the independent 
variable is the temperature T, so that the law relating pressure to temperature
has the general form

p = φ(T),    (A)

where φ is some function that is determined by experiment.

However, we know from experience that in thermodynamics it is often
necessary to interchange these roles of dependent and independent variables
and sometimes to regard the temperature T as the dependent variable and the
pressure p as the independent variable, when the temperature-pressure law
then has the form

T = ψ(p),    (B)

where, naturally, the function ψ is dependent on the form of the function φ.
Indeed, formally, φ and ψ must obviously satisfy the identity φ[ψ(p)] = p
for all pressures p in the domain of ψ.

The relationships (A) and (B) are particular cases of the notion of a 
function and its inverse and the idea is successful in this context because the 
correspondence between temperature and pressure is known to be one-one. 

Consider a general case of a function 


y = f(x)    (2.1)

that is one-one and defined on the domain [a, b], together with its inverse

x = g(y)    (2.2)


which has for its domain the interval [c, d] on the y-axis. 


Fig. 2.6 (a) Inversion through the graph of f(x); (b) inversion by reflection in y = x.


Graphically the process of inversion may be accomplished point by point 
as indicated in Fig. 2.6 (a). This amounts to selecting a point y in [c, d] and
then finding the corresponding point x in [a, b] by projecting horizontally
from y until the graph of f is intercepted, after which a projection is made
vertically downwards from this intercept to identify the required point on the 
x-axis. 

The relationship between a function and its inverse is represented in 
Fig. 2.6 (b). In this diagram we have used the fact that when a function is
represented as an ordered number pair, interchange of dependent and
independent variables corresponds to interchange of numbers in the ordered
number pair. The lower curve represents the function y = f(x) and the upper
curve represents the function y = g(x), with the function g inverse to f,
both graphs being plotted using the same axes. The line y = x is also shown
on the graph to emphasize that geometrically the relationship between a
one-one function and its inverse is obtained by reflecting the graph of either
function in a mirror held along the line y = x. Henceforth such a process will
simply be termed reflection in a line. Notice that when using this reflection
property to construct the graph of an inverse function from the graph of the
function itself, both functions are represented with y plotted vertically and
x plotted horizontally. This follows because the range of f is the domain of
g, and vice versa.

No difficulty can arise in connection with a function and its inverse 
because of the one-one nature of the mapping. Expressed more precisely, we 
have used the obvious property illustrated by Fig. 2.6 (a) that a one-one
function f with domain [a, b] is such that f(x₁) = f(x₂) ⇒ x₁ = x₂ for all x₁
and x₂ in [a, b].

In graphical terms this result can only be true if the graph of f either
increases or decreases steadily as x increases from a to b. When either of
these properties is true of a function then it is said to be strictly monotonic.
In particular, if a function f increases steadily as x increases from a to b, as
in Fig. 2.6 (a), then it is said to be strictly monotonic increasing and, conversely,
if it decreases steadily then it is said to be strictly monotonic decreasing.

Slightly less stringent than the condition of strict monotonicity is the 
condition that a function f be just monotonic. This is the requirement that f 
be either non-decreasing or non-increasing, so that it is permissible for a 
function that is only monotonic to remain constant throughout some part 
of its domain of definition. The adjectives increasing and decreasing are again 
used to qualify the noun monotonic in the obvious manner. Representative 
examples of monotonic and strictly monotonic functions, all with domain of
definition [a, b], are shown in Fig. 2.7.
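The four cases just described can be told apart on a table of sampled functional values. This is only a sketch under stated assumptions: the function name classify_monotonic is hypothetical, the values are assumed to be listed in order of increasing x, and a verdict based on samples says nothing about the behaviour of the function between sample points.

```python
def classify_monotonic(values):
    """Classify a sequence of functional values taken at increasing x."""
    if len(values) < 2:
        raise ValueError("at least two sample values are needed")
    pairs = list(zip(values, values[1:]))  # consecutive pairs of values
    if all(u < v for u, v in pairs):
        return "strictly monotonic increasing"
    if all(u > v for u, v in pairs):
        return "strictly monotonic decreasing"
    if all(u <= v for u, v in pairs):
        return "monotonic increasing"
    if all(u >= v for u, v in pairs):
        return "monotonic decreasing"
    return "not monotonic"


print(classify_monotonic([1, 2, 3]))
print(classify_monotonic([1, 1, 2]))
print(classify_monotonic([3, 2, 2]))
print(classify_monotonic([1, 3, 2]))
```

A sequence that stays constant over part of its length, such as [1, 1, 2], is reported as monotonic but not strictly monotonic, in line with the less stringent condition above.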


Fig. 2.7 Monotonic and strictly monotonic functions: (a) monotonic; (b) strictly
monotonic.


The example of a strictly monotonic decreasing function shown in Fig.
2.7 (b) has also been used to emphasize that a function need not be
represented by an unbroken curve. The curve has a break at the single point
x = α where it is defined to have the value y = β. However, as the value β
lies between the functional values on adjacent sides of x = α the function is
still strictly monotonic decreasing. Had we set β = 0, say, then the function
would be neither strictly monotonic nor even monotonic on account of this
one point!

It is sometimes useful to relate a function and its inverse by essentially 
the same symbol and this is usually accomplished by adding the superscript 
minus one to the function. Thus the function inverse to f is often denoted by
f⁻¹, which is not, of course, to be misinterpreted to mean 1/f. Before examining
some important special cases of inverse functions when many-one mappings
are involved, let us formalize our previous arguments. 


DEFINITION 2.2 Let the set onto which the one-one function f with domain
[a, b] maps the set S of points be denoted by f(S). Then we define the inverse
mapping f⁻¹ of f(S) onto S by the requirement that f⁻¹(y) = x if and only if
y = f(x) for all x in [a, b].


It now only remains for us to consider how some important special
functions such as y = x², y = sin x, and y = cos x, together with other simple
trigonometric functions which are all many-one mappings, may have
unambiguous inverses defined.

Firstly, as we have already seen, the function y = x² gives a many-one
mapping of [−a, a] onto [0, a²]. Here the difficulty of defining an inverse is
resolved by always taking the positive square root and defining two different
inverse functions

x = +√y and x = −√y,

which are then both one-one mappings of (0, a²]. The inversion must thus
be regarded as having given rise to two different functions; the one to be
selected depending on other factors as mentioned in connection with Fig.
2.3. If we recall that the domain of definition of a function forms an intrinsic
part of the definition of that function, then y = x² may be regarded as two
one-one mappings in accordance with the two inverses just introduced.
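The two inverse functions of y = x² can be written out explicitly. The sketch below is illustrative only and the name inverse_square_branches is hypothetical; it returns both branches for a given y > 0 so that the caller can select one on other grounds, as the text suggests.

```python
import math


def inverse_square_branches(y):
    """Return (x_plus, x_minus) with x_plus = +sqrt(y) and x_minus = -sqrt(y)."""
    if y <= 0:
        raise ValueError("both branches are defined here only for y > 0")
    root = math.sqrt(y)  # always the positive square root
    return root, -root


x_plus, x_minus = inverse_square_branches(9.0)
print(x_plus, x_minus)
print(x_plus ** 2, x_minus ** 2)
```

Both results square back to the original y, which confirms that each branch inverts y = x² on its own half of the domain.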

This is achieved by defining the many-one function y = x² on the domain
[−a, a] as the result of the two different one-one mappings

y = x² on −a ≤ x < 0 and y = x² on 0 < x ≤ a,

the difference here being only in the domains of definition. The point 0 is
excluded from both domains since that single point maps one-one. By means
of this device we may, in general, reduce many-one mappings to a set of
one-one mappings so that the inversion problem is always straightforward.

It will suffice to discuss in detail only the inversion of the sine function,
after which a summary of the results for the other elementary trigonometric
functions will be presented in the form of a table. In general, as shown in
Fig. 2.8 (a), the function y = sin x maps an argument x in the set R of real
numbers onto [−1, 1] many-one, but it maps any of the restricted domains
[(2n − 1)½π, (2n + 1)½π] corresponding to integral n onto [−1, 1] one-one.


Fig. 2.8 Principal branch of sine function: (a) principal branch of sin x giving
one-one mapping in [−½π, ½π]; (b) inversion of sin x by reflection in y = x.


Now in line with our approach to the inverse of the square root function, 
the ambiguity as regards the function inverse to sine may be completely 
resolved if we consider the many-one function y = sin x with x ∈ R as being
replaced by an infinity of one-one functions y = sin x, with domains
[(2n − 1)½π, (2n + 1)½π]. For then in each domain corresponding to some
integral value of n, because the mapping there is one-one, an appropriate 
inverse function may be defined without difficulty. 

The intervals are all of length π and are often said to define different
branches of the inverse sine function. In general, when no specific interval is 
named we shall write x = Arcsin y, whenever y = sin x. The function 
Arcsine thus denotes an arbitrary branch of the inverse sine function. 
Because of the periodicity of the sine function, when considering the inverse 
function it is only necessary to study the behaviour of one branch of Arcsine. 
As is customary, we arbitrarily choose to work with the branch of the inverse 
sine function associated with the domain [−½π, ½π], calling this the principal
branch and denoting the inverse function associated with this branch by
arcsine. Hence for the inverse we shall always write x = arcsin y when
y = sin x and −½π ≤ x ≤ ½π.
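The branches of the inverse sine function can be generated from the principal branch. The following sketch is not taken from the text: the name arcsin_branch is hypothetical, and the formula x = nπ + (−1)ⁿ arcsin y is the standard identity for the branch whose values lie in [(2n − 1)½π, (2n + 1)½π], used here as a stated assumption. Python's math.asin supplies the principal branch.

```python
import math


def arcsin_branch(y, n=0):
    """Value of the inverse sine on the branch with range
    [(2n - 1)*pi/2, (2n + 1)*pi/2]; n = 0 gives the principal branch."""
    if not -1.0 <= y <= 1.0:
        raise ValueError("y must lie in [-1, 1]")
    # sin(n*pi + (-1)**n * t) = sin(t), so every branch inverts sin x.
    return n * math.pi + (-1) ** n * math.asin(y)


for n in (-1, 0, 1, 2):
    x = arcsin_branch(0.5, n)
    print(n, x, math.sin(x))
```

For every n the computed x satisfies sin x = y to within rounding error, while only n = 0 returns a value in [−½π, ½π].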

In Fig. 2.8 (b) is shown in relation to the line y = x the function y = sin x
with domain of definition [−½π, ½π] and the associated function y = arcsin x
with domain of definition [−1, 1]. The reflection property of inverse functions
utilized in connection with Fig. 2.6 (b) is again apparent here. It should
perhaps again be emphasized that when an inverse function is obtained by 
reflection in the line y = x, then in both the curves representing the function 
and its inverse, the variable y is plotted as ordinate (i.e. vertically) and the 
variable x as abscissa (i.e. horizontally). 

Table 2.2 summarizes information concerning the most important inverse
trigonometric functions and should be studied in conjunction with Fig. 2.9.
In general the notation for a function inverse to a named trigonometric
function is obtained by adding the prefix arc when referring to the principal
branch and Arc otherwise. In other books the convention is often to add the
superscript minus one after the named function, distinguishing the principal
branch by use of an initial capital letter when writing the function. Thus, for
example, some authors will write Sin⁻¹ in place of arcsine and sin⁻¹ in place
of Arcsine. Unfortunately notations are not uniform here and so when using
other books the reader would be well advised to check the notation in use.

Fig. 2.9 Principal branches of inverse cosine and tangent functions: (a) principal
branch of cos x; (b) inversion of cos x by reflection in y = x; (c) principal branch of
tan x; (d) inversion of tan x by reflection in y = x.


Table 2.2 Trigonometric functions and their inverse functions

Function    Domain of function            Inverse function   Branch      Domain of inverse
y = sin x   [−½π, ½π]                     y = arcsin x       Principal   [−1, 1]
y = sin x   [(2n − 1)½π, (2n + 1)½π]      y = Arcsin x       Any         [−1, 1]
y = cos x   [0, π]                        y = arccos x       Principal   [−1, 1]
y = cos x   [nπ, (n + 1)π]                y = Arccos x       Any         [−1, 1]
y = tan x   (−½π, ½π)                     y = arctan x       Principal   (−∞, ∞)
y = tan x   ((2n − 1)½π, (2n + 1)½π)      y = Arctan x       Any         (−∞, ∞)


2.3 Some special functions 


A number of special types of function occur often enough to merit some 
comment. As the ideas involved in their definition are simple, a very brief 
description will suffice in all but a few cases. To clarify these descriptions, the 
functions are illustrated in Fig. 2.10.


(a) Constant function 


The constant function is a function y = f(x) for which f(x) is identically 
equal to some constant value for all x in the domain of definition [a, b].
Thus a constant function has the equation y = constant, for x ∈ [a, b].


(b) Step function 

Consider some set of n sub-intervals or partitions [a₀, a₁), [a₁, a₂), [a₂, a₃),
. . ., [aₙ₋₁, aₙ] of the interval [a₀, aₙ]. Associate n constants C₁, C₂, . . ., Cₙ
with these n sub-intervals. Then a step function defined on [a₀, aₙ] is the
function y = f(x) for which f(x) = Cᵣ for all x in the rth sub-interval. The
function will be properly defined provided a functional value is assigned to
all points x in [a₀, aₙ] including end points of the intervals. Usually it is
immaterial to which of two adjacent sub-intervals an end point is assigned
and one possible assignment is indicated in Fig. 2.10 (b), where a deleted end
point is shown as a circle and an included end point as a dot.

Fig. 2.10 Some special functions: (a) constant function; (b) step function;
(c) y = |x|; (d) even function; (e) odd function; (f) bounded function on [a, b].
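A step function of this kind is easily realised with a table of end points and constants. The sketch below is an illustration under stated assumptions: the name make_step_function and the sample data are hypothetical, and each interior end point is assigned to the sub-interval on its right, which is only one of the possible assignments mentioned above.

```python
from bisect import bisect_right


def make_step_function(edges, constants):
    """Build f with f(x) = constants[r - 1] on the r-th sub-interval.

    edges     : end points a_0 < a_1 < ... < a_n
    constants : the n values C_1, ..., C_n
    Sub-intervals are [a_{r-1}, a_r), except the last, which is closed.
    """
    if len(edges) != len(constants) + 1:
        raise ValueError("need exactly one more end point than constants")
    if any(right <= left for left, right in zip(edges, edges[1:])):
        raise ValueError("end points must be strictly increasing")

    def f(x):
        if not edges[0] <= x <= edges[-1]:
            raise ValueError("x lies outside the interval of definition")
        index = bisect_right(edges, x) - 1        # sub-interval containing x
        return constants[min(index, len(constants) - 1)]  # x = a_n: last one

    return f


step = make_step_function([0, 1, 2, 4], [5, -1, 3])
print([step(x) for x in (0, 0.5, 1, 1.5, 2, 4)])
```

Changing bisect_right to bisect_left would instead assign each interior end point to the sub-interval on its left, apart from the first end point, which would then need separate treatment.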


(c) The function |x| 
From the definition of the absolute value of x it is easily seen that the graph
of y = |x| has the form shown in Fig. 2.10 (c). It is composed of the line
y = x for x ≥ 0 and the line y = −x for x < 0.


(d) Even function 


An even function y = f(x) is a function for which f(—x) = f(x). The geo- 
metrical implication of this definition is that the graph of an even function is 
symmetrical about the y-axis so that the graph for negative x is the reflection 
in the y-axis of the graph for positive x. Typical examples of even functions 
are y = cos x, y = 1/(1 + x²) and the function y = |x| just defined.


(e) Odd function


An odd function y = f(x) is a function for which f(−x) = −f(x). The
geometrical implication of this definition is that the graph of an odd function is
obtained from its graph for positive x by first reflecting the graph in the
y-axis and then reflecting the result in the x-axis. In Fig. 2.10 (e) the result of
the first reflection is shown as a dotted curve and its reflection in the x-axis
gives a second curve shown as a full line in the third quadrant which,
together with the original curve in the first quadrant, defines the odd function.
By virtue of the definition we must have f(0) = −f(0), showing that the
graph of an odd function must pass through the origin. Typical odd functions
are y = sin x and y = x³ − 3x. Most functions are neither even nor odd.
For example, y = x³ − 3x + 1 is not even, since y(−x) = (−x)³ − 3(−x)
+ 1 = −x³ + 3x + 1 ≠ y(x), nor, by the same argument, is it odd, for
y(−x) ≠ −y(x).
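The definitions of even and odd functions lend themselves to a quick numerical check. This is a sketch only: the name classify_parity, the sample points, and the tolerance are assumptions, and agreement at a handful of points can refute a symmetry but cannot prove it for all x.

```python
import math


def classify_parity(f, points, tol=1e-9):
    """Test f(-x) = f(x) and f(-x) = -f(x) at the given sample points."""
    is_even = all(math.isclose(f(-x), f(x), abs_tol=tol) for x in points)
    is_odd = all(math.isclose(f(-x), -f(x), abs_tol=tol) for x in points)
    if is_even and is_odd:
        return "even and odd"  # happens only if f vanishes at every sample
    if is_even:
        return "even"
    if is_odd:
        return "odd"
    return "neither"


samples = [0.5, 1.0, 2.0, 3.0]
print(classify_parity(math.cos, samples))
print(classify_parity(lambda x: x**3 - 3 * x, samples))
print(classify_parity(lambda x: x**3 - 3 * x + 1, samples))
```

The third function fails both tests because of its constant term, in agreement with the calculation of y(−x) given in the text.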


(f) Bounded function 


A function y = f(x) is said to be bounded on an interval if it is never 
larger than some value M and never smaller than some value m for all values 
of x in the interval. The numbers M and m are called, respectively, upper and 
lower bounds for the function f(x) on the interval in question. It may of 
course happen that only one of these conditions is true, and if it never exceeds
M then it is said to be bounded above, whereas if it is never less than m it is
said to be bounded below. A bounded function is thus a function that is
bounded both above and below. The bounds M and m need not be strict in
the sense that the function ever actually attains them. Sometimes when the
bounds are strict they are only attained at an end point of the domain of
definition of the function.

Of all the possible upper bounds M that may be assigned to a function 
that is bounded above on some interval, there will be a smallest one M′, say.
Such a number M′ is called the least upper bound or the supremum of the
function on the interval and the name is usually abbreviated to l.u.b. or to
sup. Similarly, of all the possible lower bounds m that may be assigned to a
function that is bounded below on some interval, there will be a largest one
m′, say. Such a number m′ is called the greatest lower bound or the infimum
of the function on the interval and the name is usually abbreviated to g.l.b.
or to inf.

Not all functions are bounded either above or below, as evidenced by the
function y = tan x on (−½π, ½π), though it is bounded on any closed
sub-interval not containing either end point. Typical examples of bounded
functions on the interval (−∞, ∞) are y = sin x and y = cos x/(1 + x²). The
function y = 1/(x − 1) is bounded below by zero on the interval (1, ∞) but
is unbounded above, whereas the function y = 2 − x² is strictly bounded
above by 2 but is unbounded below on the interval (−∞, ∞).


(g) Convex and concave functions 


A convex function is one which has the property that a chord joining any 
two points A and B on its graph always lies above the graph of the function 
contained between those two points. Similarly, a concave function is one 
which has the property that a chord joining any two points A and B on its 
graph always lies below the graph of the function contained between those 
two points. Thus the function y = |x| shown in Fig. 2.10 (c) is convex on the
interval (−∞, ∞) whereas the function shown in Fig. 2.10 (d) is only concave
on the closed interval [−a, a].


(h) Polynomial and rational functions 


A polynomial of degree n is an algebraic expression of the form

\[
y = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,
\]

where n is a positive integer and it is defined for all x.
A rational function is a function which is capable of expression as the
quotient of two polynomials and so has the form

\[
y = \frac{b_m x^m + b_{m-1} x^{m-1} + \cdots + b_1 x + b_0}{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0}
\]

and is defined for all values of x for which the denominator does not vanish.


An example of a polynomial of degree 2 is the quadratic function
y = x² − 3x + 4; a typical rational function is

\[
y = \frac{3x^2 - 2x - 1}{4x^3 + 11x^2 + 5x - 2},
\]

which is defined for all values of x apart from x = −2, x = −1, and x = ¼,
at which points the denominator vanishes. For this reason these values are
called the zeros of the polynomial forming the denominator and they arise
directly from its factorization into the form

\[
4x^3 + 11x^2 + 5x - 2 = (4x - 1)(x + 2)(x + 1).
\]
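The zeros of the denominator 4x³ + 11x² + 5x − 2 and its factorization (4x − 1)(x + 2)(x + 1) can be confirmed with exact rational arithmetic. The sketch below is a check, not a derivation, and the function names are chosen only for illustration.

```python
from fractions import Fraction


def denominator(x):
    """The cubic 4x^3 + 11x^2 + 5x - 2 in expanded form."""
    return 4 * x**3 + 11 * x**2 + 5 * x - 2


def factored(x):
    """The same cubic written as (4x - 1)(x + 2)(x + 1)."""
    return (4 * x - 1) * (x + 2) * (x + 1)


zeros = [Fraction(-2), Fraction(-1), Fraction(1, 4)]
print([denominator(z) for z in zeros])

# Two cubics that agree at more than three points are identical.
check_points = [Fraction(k, 3) for k in range(-6, 7)]
print(all(denominator(x) == factored(x) for x in check_points))
```

Using Fraction rather than floating-point numbers avoids any rounding at x = ¼, so the three values printed are exactly zero.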


(i) Algebraic function 


An algebraic function arises when attempting to form the inverse of a rational 
function. The function y = +√x for x > 0 provides a typical example here.
More complicated examples are the functions: 


y = x28 γε χε ναχ-} γτε χ ψ]Ω -- x). 


More precisely, we shall call the function y = f(x) algebraic if it may be
transformed into a polynomial involving the two variables x and y, the
highest powers of x and y both being greater than unity. This criterion may
easily be applied to any of the above examples. In the case of the last example,
y = x√[x/(2 − x)], a simple calculation soon shows that it is equivalent to the
polynomial 2y² − xy² − x³ = 0, which is of degree 2 in y and 3 in x.


(j) Transcendental function 


A function is said to be transcendental if it is not algebraic. A simple example
is y = x + sin x, which is defined for all x but is obviously not algebraic. 


(k) The function [x] 


On occasions when working with quantities that may only assume integral 
values it is useful to write y = [x] with the meaning that we assign to every 
real number x the greatest integer y that is less than or equal to it. Thus, for 
example, we have [−3] = −3, [−1.3] = −2, [0] = 0, [0.92] = 0,
[π] = 3, and [17] = 17.
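The function [x] corresponds to what most programming languages call the floor function. As a brief illustration, and assuming only Python's standard math.floor, the listed values can be reproduced as follows; the wrapper name greatest_integer is hypothetical.

```python
import math


def greatest_integer(x):
    """Greatest integer less than or equal to x, i.e. the function [x]."""
    return math.floor(x)


arguments = [-3, -1.3, 0, 0.92, math.pi, 17]
print([greatest_integer(x) for x in arguments])  # [-3, -2, 0, 0, 3, 17]
```

Note that for negative non-integral arguments the result lies below x, so [−1.3] = −2 and not −1, which is what simple truncation would give.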


2.4 Digression on mappings

Having now examined in some detail specific examples of functions providing 
one-one and many—one mappings, it will be helpful to take a slightly more 
general look at the notion of a mapping. We again appeal to the Venn 
diagram, but this time supplement it by the addition of arrows to suggest the 
form of mapping that is involved. 

In Fig. 2.11 pairs of closed curves have been used to represent the sets
A and B postulated in the formulation of the more general definition of a
function f given in Definition 2.1. Once again points inside a curve represent
elements in the set, with set A representing the domain of the function f and
set B the range of f. The arrows relating sets A and B in the three pairs of
diagrams are then self-explanatory when taken in conjunction with the
caption.

Fig. 2.11 Mappings: (a) B = f(A), a one-many mapping; (b) B = f(A), a many-one
mapping; (c) B = f(A), a one-one mapping.

The mappings illustrated in Fig. 2.11 are often said to be onto mappings, 
in the sense that the set A is mapped by function f onto the entirety of set B. 
Thus, in each case, every element in B is associated with at least one element 
in A. Naturally if some set C containing B is considered in place of B, then 
there will be elements of C that are not associated with any element in A. The 
mapping of A into C by f is then said to be an into mapping. 

For example, if the function concerned is y = x², then it maps the set A 
comprising the interval [1, 2] into the set C comprising the interval [1, 9], 
but onto the set B comprising the interval [1, 4]. 
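The distinction can be illustrated numerically. The sketch below is only a check under stated assumptions: the function is continuous, so the image of an interval is itself an interval and is fixed by the smallest and largest sampled values. The helper name `image_interval` and the sample count are ours.

```python
def image_interval(f, a, b, samples=10001):
    """Approximate the image of [a, b] under a continuous f by sampling."""
    values = [f(a + (b - a) * k / (samples - 1)) for k in range(samples)]
    return min(values), max(values)

# y = x^2 on A = [1, 2]
lo, hi = image_interval(lambda x: x * x, 1.0, 2.0)

is_into_C = 1.0 <= lo and hi <= 9.0   # image lies inside C = [1, 9]
is_onto_C = (lo, hi) == (1.0, 9.0)    # image would have to fill all of C
is_onto_B = (lo, hi) == (1.0, 4.0)    # image fills B = [1, 4]
print(is_into_C, is_onto_C, is_onto_B)
```

The image is [1, 4], so the mapping is into C but onto B, in agreement with the text.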

These ideas are of real importance when a double mapping is involved, 
for then it is necessary to examine the relationship that exists between the 
range of the first function and the domain of definition of the second. If the 
first mapping is by a function f and the second mapping is by a function g, 
then the result of the successive mappings is called the composition of f and g 
and is usually denoted by f ∘ g. The order implies that f is the first mapping 
which is then followed by g. Using perhaps more familiar terminology and 
notation we are speaking here of the 'function of a function' g{f(x)}. 

The general ideas involved here are illustrated in Fig. 2.12. There (a) and 
(b) indicate the respective domains and ranges of f and g whilst (c) indicates 
how, in general, the function f ∘ g has for its domain only part of A and for 
its range only part of D. 

Fig. 2.12 (a) Mapping by f of A onto B; (b) mapping by g of C onto D; (c) composition 
f ∘ g. Panel labels: (a) domain of f, range of f; (b) domain of g, range of g; 
(c) domain of f ∘ g, elements common to range of f and domain of g, range of f ∘ g. 


The symbolic representation suggested in Fig. 2.12 can be made more 
meaningful by considering the following. Let f(x) = 3x + 1 with domain 
(−∞, 4/3] and g(x) = +√(9 − x) with domain [1, 9]. Then the range of f 
is (−∞, 5] and the range of g is [0, 2√2]. The range of f thus only coincides 
with the domain of g in the interval [1, 5]. Hence the part of the domain of g 
that is common to the range of f is a one-one mapping by f of the interval 
[0, 4/3]. This interval must then be the domain of f ∘ g. Next, the function g 
maps [1, 5] onto the interval [2, 2√2], which must be the range of f ∘ g. 
Thus we have obtained the following: 

Domain of f: (−∞, 4/3], 
Range of f: (−∞, 5], 
Domain of g: [1, 9], 
Range of g: [0, 2√2], 
Domain of f ∘ g: [0, 4/3], 
Range of f ∘ g: [2, 2√2]. 

Using direct algebraic substitution we see that in fact if f(x) = 3x + 1 and 
g(x) = +√(9 − x), then f ∘ g = g{f(x)} = +√[9 − (3x + 1)] = +√(8 − 3x). 
This confirms directly that f ∘ g maps [0, 4/3] onto [2, 2√2], but does not 
take explicit account of the effect of the domain of g on the mapping. 
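The bookkeeping of domains in this example can be sketched in code. The names `in_composite_domain` and `composite` are ours; the sketch simply encodes the rule that x belongs to the domain of g{f(x)} only when x lies in the domain of f and f(x) lies in the domain of g.

```python
import math

F_DOMAIN_MAX = 4 / 3        # f is defined on (-inf, 4/3]
G_DOMAIN = (1.0, 9.0)       # g is defined on [1, 9]

def f(x):
    return 3 * x + 1

def g(x):
    return math.sqrt(9 - x)  # positive square root

def in_composite_domain(x):
    """True when x is in dom f and f(x) is in dom g."""
    return x <= F_DOMAIN_MAX and G_DOMAIN[0] <= f(x) <= G_DOMAIN[1]

def composite(x):
    """Evaluate g{f(x)}, respecting both domains."""
    if not in_composite_domain(x):
        raise ValueError("x is outside the domain of the composition")
    return g(f(x))

print(composite(0.0), composite(4 / 3))
```

Evaluating at the end points 0 and 4/3 reproduces the end points 2√2 and 2 of the range, while points such as x = −0.1 (where f(x) < 1) are rejected even though +√(8 − 3x) could be evaluated there.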


2.5 Curves and parameters 


A parameter α may be associated with a curve in two quite different ways. 
In the first situation we shall discuss, the parameter α occurs as a constant 
in the equation describing the curve. Thus changing the value of α will change 
the curve that is described. This simple idea underlies the geometrical concept 
of an envelope, which will be taken up again later in connection with 
differentiation and with differential equations. 

In the second situation, α will appear as a variable associated with two 
functions s(α) and t(α), which will describe separately the x and y coordinates 
of points on any unbroken curve. This use of a parameter is called the 
parameterization of a curve and is an alternative method of representing the 
equation of the curve. 


(a) Envelopes 

This situation is best explained by means of an example. Consider the equation 

(x − α)² + y² = α²/(1 + α²), 

which in this form is easily seen to describe a circle of radius |α|/√(1 + α²) 
with its centre on the x-axis at the point x = α. Obviously, changing α will 
both move the centre of the circle and alter its radius, as shown in Fig. 2.13. 

Fig. 2.13 Envelope shown as dotted line. 

If α is allowed to vary in some interval, then the single equation will 
describe a set of circles, each one corresponding to a different value assumed 
by α in that interval. Collectively these circles are a family of circles with 
parameter α. If a curve exists that is tangent to every member of a family of 
curves, but is not itself a member of the family, then it is an envelope of the 
family. An envelope can be a curve of infinite length or on occasions it may 
reduce either to a curve of finite length or, in degenerate cases, to a single 
point. 

In Fig. 2-13 the envelope is shown as a dotted curve and, as would be 
expected in this case, the envelope is symmetrical about both the x- and y-axes. 
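A rough numerical picture of this envelope can be obtained without any calculus. The sketch below rests on an assumption of ours, namely that for this family the envelope coincides with the outer boundary of the region swept out by the circles; it is not the method developed later in the book. The function name `upper_boundary`, the range of α, and the grid size are all illustrative choices.

```python
import math

def upper_boundary(x, alpha_min=-20.0, alpha_max=20.0, steps=40001):
    """Largest y >= 0 reached at abscissa x by any sampled circle of the family
    (x - alpha)^2 + y^2 = alpha^2 / (1 + alpha^2)."""
    best = 0.0
    for k in range(steps):
        alpha = alpha_min + (alpha_max - alpha_min) * k / (steps - 1)
        y_squared = alpha**2 / (1 + alpha**2) - (x - alpha) ** 2
        if y_squared > best:
            best = y_squared
    return math.sqrt(best)

for x in (-10.0, -1.5, 0.0, 1.5, 10.0):
    print(x, round(upper_boundary(x), 4))
```

The printed values are the same at x and −x, which reflects the symmetry about the y-axis; symmetry about the x-axis follows because only y² appears in the equation. The boundary passes through the origin and approaches the line y = 1 for large |x|, since the radius |α|/√(1 + α²) tends to 1.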

If the family of circles that led to this envelope is written in the form 

(x − α)² + y² − α²/(1 + α²) = 0, 

then it is seen to be a special case of an equation in three variables having the 
general form 

f(x, y, α) = 0. (2.3) 


This is the standard form for an equation defining a family of curves with 
parameter α and it will be used later to determine the equation of the envelope 
when it exists. 

However, it is easy to see that a family of curves does not always have an 
envelope associated with it, since the concentric circles x² + y² = α² form a 
perfectly good family with parameter α, but clearly there is no line that is 
tangent to each circle in the family. 

Expression (2.3) is an implicit representation of a function in the sense 
that it is not directly obvious how and when it is possible to re-express it in 
the more familiar explicit form 

y = F(x, α). (2.4) 


(b) Parameterization of a curve 


We have seen that when a curve is represented by an explicit equation of the 
form y = f(x), then for inversion reasons the mapping must be one-one. 
In other words, either f must be strictly monotonic in its domain of definition 
or, if not, it must be expressible piecewise as a set of new functions which are 
strictly monotonic on suitably chosen domains. 

A more general representation of a curve that overcomes the necessity 
for sub-division of the domain, and even allows curves with loops, may be 
achieved by the introduction of the notion of parametric representation of a 
curve. The idea here is simple and is that instead of considering x and y 
to be directly related by some function f, we instead consider x and y 
separately to be functions of the variable parameter α. Thus we arrive at the 
pair of equations 

x = s(α), y = t(α), (2.5) 

with a ≤ α ≤ b, say, which together define a curve. For any value of α in 
[a, b] we can use these equations to determine unique values of x and y, 
and hence to plot a single point on the curve represented parametrically by 
Eqn (2.5). The set of all points described by Eqn (2.5) then defines a curve. 


As a simple example of a curve without loops we may consider the 
parametric equations 

y = α², x = α for −∞ < α < ∞. 

These obviously define a parabola that lies in the upper half plane and is 
symmetrical about the y-axis with its vertex at the origin. 
Elimination of α is easy here and results in the explicit representation y = x². 
In more complicated cases the parameter cannot usually be eliminated and, 
indeed, this should not be expected since parametric representation is more 
general than explicit representation. 

An important consequence of the parametric representation of a curve is 
that increasing the value of the parameter defines a sense of direction along 
the curve which is often very useful in more advanced applications of these 
ideas. An example of a curve containing a loop is provided by the parametric 
equations 

x = α³ − α, y = 4 − α² for −2 < α < 2, 

which is shown in Fig. 2.14 together with the sense of direction defined by 
increasing α. 

It is implicit in the concept of the parametric representation of a curve 
that a given curve may be parameterized in more than one way. Hence 
changing the variable in a parameterization will give a different parametric 
representation of the same curve. Thus if in the example above we replace 
the parameter α by the parameter β using the relationship α = β + 1, then 
it is readily seen that 

x = β³ + 3β² + 2β, y = 3 − 2β − β² for −3 < β < 1. 

This is an alternative parameterization of the same curve shown in Fig. 2.14. 
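The equivalence of the two parameterizations, and the presence of the loop, can be checked numerically. This is only a sketch; the function names `curve_alpha` and `curve_beta` and the sampling step are ours.

```python
def curve_alpha(alpha):
    """Point (x, y) given by x = alpha^3 - alpha, y = 4 - alpha^2."""
    return alpha**3 - alpha, 4 - alpha**2

def curve_beta(beta):
    """Point (x, y) given by the parameterization in beta, with alpha = beta + 1."""
    return beta**3 + 3 * beta**2 + 2 * beta, 3 - 2 * beta - beta**2

# Compare the two descriptions at sample values of beta between -3 and 1.
max_gap = 0.0
for k in range(-300, 101):
    beta = k / 100
    xa, ya = curve_alpha(beta + 1)
    xb, yb = curve_beta(beta)
    max_gap = max(max_gap, abs(xa - xb), abs(ya - yb))

# Two different parameter values giving the same point reveal the loop.
loop_points = [curve_alpha(a) for a in (-1.0, 1.0)]
print(max_gap, loop_points)
```

Both α = −1 and α = 1 give the point (0, 3), so the curve crosses itself there and the arc between these parameter values forms the loop.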


2.6 Functions of several real variables 


In physical situations, to say that a quantity depends only on one other 
quantity is usually a gross oversimplification. Indeed, this was so in the 
thermodynamic illustration used to introduce the notion of a function of one 
real variable, because we insisted on maintaining a constant volume of gas. 
In general the pressure p of a given gas will depend on both its temperature T 
and its volume v. Here we would say that there was a functional relationship 
between p, T, and v which, in an implicit form, may be expressed by the 
equation 

f(p, T, v) = 0. (2.6) 

The function f occurring here is a function of three real variables and obviously 
depends for its form on the particular gas involved. 

Usually one of the three quantities, say p, is regarded as a dependent 
variable with the others, namely T and v, being regarded as independent 
variables. Solving Eqn (2.6) for p then gives rise to an explicit expression of 
the form 

p = g(T, v), (2.7) 

with g then being called a function of two real variables. 

Just as with a function of a single real variable, in addition to specifying 
the functional form it is also necessary to stipulate the domain of definition 
of the function. Thus Eqn (2-7), which in thermodynamic terms would be 
called the equation of state of the gas, would only be valid for some range of 
temperature and volume. In this case the reason for the restriction on the 
temperature and volume is a physical one, whereas in other situations it is 
likely to be a purely mathematical one. 
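As a concrete illustration of the implicit and explicit forms, the sketch below uses the ideal gas law for one mole of gas. This particular equation of state is our assumption and is not taken from the text, as are the numerical value of the gas constant and the sample temperature and volume.

```python
R = 8.314462618  # molar gas constant in J/(mol K); assumed here, not from the text

def f(p, T, v):
    """Implicit form f(p, T, v) = 0 for one mole of an ideal gas (assumed model)."""
    return p * v - R * T

def g(T, v):
    """Explicit form p = g(T, v); the domain is restricted to T > 0 and v > 0."""
    if T <= 0 or v <= 0:
        raise ValueError("outside the domain of definition")
    return R * T / v

p = g(300.0, 0.025)             # pressure in Pa at 300 K and 0.025 m^3
residual = f(p, 300.0, 0.025)   # should vanish, since p satisfies the implicit form
print(p, residual)
```

Here the restriction T > 0, v > 0 is a physical one, built into g as a domain check.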

Extending the ideas already introduced we shall now let R² denote the 
set of all ordered pairs (x, y) of real numbers and let S be some subset of R². 

DEFINITION 2.3 We say f is a real valued function of the real variables 
x and y defined in set S if, for every (x, y) ∈ S, there is defined a real number 
denoted by f(x, y). 

As is the case with a function of one variable, when the domain of 
definition of a real valued function of two or more real variables is not 
specified it is to be understood to be the largest possible domain on which the 
function can be defined. Thus, for example, the largest subset S ⊂ R² in which 
the function f(x, y) = √(1 − x² − y²) is defined is given by 

S = {(x, y) ∈ R² | x² + y² ≤ 1}. 


This concept of a function immediately extends to include functions of 
more than two variables. Using Rⁿ to denote the set of all ordered n-tuples 
(x₁, x₂, ..., xₙ) of real numbers of which S is some subset, this definition 
can be formulated. 

DEFINITION 2.4 We shall say that f is a real valued function of the real 
variables x₁, x₂, ..., xₙ defined in set S if, for every (x₁, x₂, ..., xₙ) ∈ S, 
there is defined a real number denoted by f(x₁, x₂, ..., xₙ). 

A typical example of a function of the three variables x, y, z is provided 
by f(x, y, z) = √(2 − x) + √(9 − y²) + √(16 − z⁴). The largest subset 
S ⊂ R³ for which this function may be defined is obviously 

S = {(x, y, z) ∈ R³ | x ≤ 2; −3 ≤ y ≤ 3; −2 ≤ z ≤ 2}. 
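The description of S can be tested against the function itself. In the sketch below, which uses names of our own choosing, `is_defined` simply attempts the evaluation (Python's `math.sqrt` raises `ValueError` for a negative argument) and `in_S` encodes the three inequalities; the sample points are arbitrary.

```python
import math

def f(x, y, z):
    return math.sqrt(2 - x) + math.sqrt(9 - y**2) + math.sqrt(16 - z**4)

def in_S(x, y, z):
    """Membership of the set S given by the three inequalities."""
    return x <= 2 and -3 <= y <= 3 and -2 <= z <= 2

def is_defined(x, y, z):
    """True when every square root in f has a non-negative argument."""
    try:
        f(x, y, z)
        return True
    except ValueError:
        return False

points = [(2, 3, 2), (-100, -3, -2), (0, 0, 0),
          (2.1, 0, 0), (0, 3.5, 0), (0, 0, -2.5)]
agreement = [in_S(*p) == is_defined(*p) for p in points]
print(agreement)
```

At every sample point the two tests agree, including the boundary point (2, 3, 2) where all three square roots vanish.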


The geometrical idea underlying the graph of a function of a single variable 
also extends to real functions f of two real variables x, y. Denote the value of 
the function f at (x, y) by z, so that we may write z = f(x, y). Then with each 
point of the (x, y)-plane at which f is defined we have associated a third 
number z = f(x, y). Taking three mutually perpendicular straight lines with 
a common origin O as axes, we may then identify two of the axes with the 
independent variables x and y and the third with the dependent variable z. 
The ordered number triples (x, y, z) = (x, y, f(x, y)) may then be plotted as 
points in a three-dimensional geometrical space. The set of points (x, y, z) 
corresponding to the domain of definition of the function f(x, y) then defines 
a surface which, in practice, usually turns out to be smooth. It is conventional 
to plot z vertically. 

On account of the geometrical representation just described, even in Rⁿ 
it is customary to speak of the ordered n-tuple of numbers (x₁, x₂, ..., xₙ) 
as defining a 'point' in the 'space' Rⁿ. 

By way of illustration of a graph of a function of two variables we now 
consider 

f(x, y) = x²/4 + y²/9 with x²/4 + y²/9 ≤ 2, 

where the inequality serves to define a domain of definition for the function. 
The surface described by this function has the equation z = x²/4 + y²/9 
and the domain of definition is the interior and boundary of the curve 
x²/4 + y²/9 = 2. If this latter expression is rewritten in the form 
x²/8 + y²/18 = 1 then it can be seen that the domain of definition of f is in 
fact the interior of an ellipse in the (x, y)-plane having semi-minor axis 
2√2 and semi-major axis 3√2, and being centred on the origin. As f(x, y) is 
an essentially positive quantity it follows directly that 0 ≤ z ≤ 2 in the domain 
of f. 

Fig. 2.15 Surfaces and level curves: (a) representation of surface; (b) level curves. 

To deduce the form of the surface, two further geometrical concepts are 
helpful. The first is the notion of the curve defined by taking a cross-section 
of the surface parallel to the z-axis. The second is the notion of a contour 
line or level curve, defined by taking a cross-section of the surface perpendicular 
to the z-axis. 

To examine a cross-section of the surface by the plane y = a, say, we 
need only set y = a in f(x, y) to obtain z = x²/4 + a²/9, showing that the 
curve so defined is a parabola with vertex at a height z = a²/9 above the 
y-axis. A similar cross-section by the plane x = b shows that the curve so 
defined is z = b²/4 + y²/9, which is also a parabola, but this time with its 
vertex at a height z = b²/4 above the x-axis. (See Fig. 2.15 (a).) If desired, 
sections by other planes parallel to the z-axis may also be used to assist 
visualization of the surface. 

The curve defined by a section of the surface resulting from a cross-section 
taken perpendicular to the z-axis is called a contour line or level curve by 
direct analogy with cartography, where such lines are drawn on a map to 
show contours of constant altitude. Level curves are obtained by determining 
the curves in the (x, y)-plane for which z = constant, and it is customary to 
draw them all on one graph in the (x, y)-plane with the appropriate value of 
z shown against each curve. (See Fig. 2.15 (b).) 

Let us determine the level curve in our example corresponding to z = 1/2, 
which is representative of z in the range 0 ≤ z ≤ 2. We must thus find the 
curve with the equation x²/4 + y²/9 = 1/2, which we choose to rewrite in the 
standard form x²/2 + y²/(9/2) = 1. This shows that it describes an ellipse 
centred on the origin with semi-minor axis √2 and semi-major axis 3/√2. 
It is not difficult to see that all the level curves are ellipses; the one 
corresponding to z = 2 being the boundary of the domain of f and the one 
corresponding to z = 0 degenerating to the single point at the origin. 
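The same calculation can be carried out for any level z = c with 0 ≤ c ≤ 2. Writing x²/4 + y²/9 = c as x²/(4c) + y²/(9c) = 1 gives semi-axes 2√c and 3√c. The sketch below, with a helper name of our own, evaluates these for the three levels mentioned above.

```python
import math

def f(x, y):
    return x**2 / 4 + y**2 / 9

def level_curve_semi_axes(c):
    """Semi-axes (along x, along y) of the level curve f(x, y) = c."""
    if not 0 <= c <= 2:
        raise ValueError("c must lie in the range of f, 0 <= c <= 2")
    return 2 * math.sqrt(c), 3 * math.sqrt(c)

for c in (0.0, 0.5, 2.0):
    print(c, level_curve_semi_axes(c))
```

For c = 1/2 this gives √2 and 3/√2, for c = 2 it gives 2√2 and 3√2 (the boundary of the domain), and for c = 0 both semi-axes vanish.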


PROBLEMS 


Section 2.1 

2.1 Sketch the graphs of these functions: 
(a) f(x) = x² − 3x − 2 (−1 < x < 3); 
(b) f(x) = x + sin x (−π/2 < x < π/2); 
(c) f(x) = x³ (−2 < x < 2); 
(d) f(x) = x² + 1/x (0.2 < x < 2); 
(e) f(x) = x + 1/x² (0.5 < x < 5). 

2.2 Determine the domain and the range of each of functions (a) to (e) defined 
above. 

2.3 Determine the range of the function f(x) = x² + 1 corresponding to each of 
the following domains and state when the mapping is one-one and when it is 
many-one: 

(a) [−1, 1]; (b) (2, 4); (c) [−2, 4]; (d) [−3, 1]. 

2.4 Find the largest domain of definition for each of the following functions: 

(a) f(x) = x³ + 3; (b) f(x) = x² + √(1 − x²); 
(c) f(x) = x + √(1 − x⁴); (d) f(x) = 1/(x² − 1); 
(e) f(x) = x + 1/x; (f) f(x) = x²/(1 + x). 

2.5 Let f(n) denote the function that assigns to any positive integer n the number of 
positive integers whose square is less than or equal to n + 2. By enumerating 
the first few values of f(n) deduce the values of n for which f(n) = 3. 

2.6 An integer n is said to be a prime number if its only factors are 1 and n. 
Given that f(n) is the function that associates with n the number of primes less 
than or equal to 2n + 1, enumerate the first ten values of f(n). 

2.7 Give two examples of functions which are defined only for discrete values of 
the dependent and independent variables. 

2.8 Sketch representative members of the two pencils of lines described by 
y = α(x − 1) + 2 and y = β(x − 2) + 3, where α and β are parameters. 
Locate the two singular points and suggest how α and β may be used as 
coordinates for points in the plane of the two pencils. When will the coordinates 
α and β fail to identify points? 

2.9 Suppose that f is the function that assigns to every qualified driver the name of 
the driving examiner who issued his licence. Identify the domain A and the 
range B of f, stating the nature of the mapping involved. 

2.10 Give two examples of functions relating non-numerical quantities. 

Section 2.2 


2.11 Sketch the graphs of the following functions in their stated domains of 
definition and in each case use the process of reflection in the line y = x to 
construct the graph of the inverse function: 

(a) f(x) = x³ with x ∈ [−2, 2]; 
(b) f(x) = x + sin x with x ∈ [0, π/2]; 
(c) f(x) = x/(1 + x²) with x ∈ [−1, 2]. 

2.12 Where appropriate, classify the following functions as either monotonic or 
strictly monotonic increasing or decreasing on the stated domains of 
definition: 

(a) f(x) = x² for x ∈ [−1, 2]; 
(b) f(x) = x² for x ∈ [−1, 0); 
(c) f(x) = sin x for x ∈ [−3π/4, π/4]; 
(d) f(x) = cos x for x ∈ [0, π]; 
(e) f(x) = tan x for x ∈ [−π/4, π/4]; 
(f) f(x) = x for x ∈ [0, 1], 
    f(x) = 1 for x ∈ (1, 2], 
    f(x) = x²/4 for x ∈ (2, 6]; 
    x for x ∈ [1, 2) 


(g) f@) = "ἢ for x € [2, 4]. 


2.13 Complete the entries in this table: 


f 5 oC το τ΄ 
x [—3, !] 
x3 (2, 4] 
1/1 + x) [!, 3] 
sin x [—37, tz] 
cos (x + 47) [0, 7] 
tan [x — i] [0, ἐπ] 


Section 2.3 

2.14 Sketch these functions in their associated domains of definition: 
(a) f(x) = |2x| for x ∈ [−2, 2]; 
(b) f(x) = x + |x| for x ∈ [−2, 2]; 
(c) the step function assuming the values 1, 2, −3, 2, 4 on the x intervals 
(0, 1), [1, 2], (2, 3.5), [3.5, 4], and (4, 5], respectively. Identify end points 
belonging to a line by a dot and end points deleted from a line by a circle. 
(d) f(x) = |x| for x ∈ [0, 1), 
    f(x) = |x − 1| for x ∈ [1, 2), 
    f(x) = |x − 2| for x ∈ [2, 3). 

2.15 Where appropriate, classify the following functions as even or odd: 
(a) f(x) = x + [x]; 
(b) f(x) = x + sin 2x; 
(c) f(x) = x² + sin x; 
(d) f(x) = 1/x; 
(e) f(x) = x?2/(1 + x?)*; 
(Ὁ f(x) = χὃ — x8 + x; 
(g) f(x) = 2 cos x + sin x. 


It is obvious that any arbitrary function f(x) which is defined in an interval 
I containing the origin may be written in the form 

f(x) = ½(f(x) + f(−x)) + ½(f(x) − f(−x)), 

in any interval J ⊂ I that is symmetric about the origin. Such an interval J 
is said to be interior to I. This shows that any such f(x) is expressible as the 
sum of an even function ½(f(x) + f(−x)), and an odd function ½(f(x) − f(−x)) 
within J. Apply this result to display the following functions as the sum of 
even and odd parts, in each case stating the largest interval J for which the 
result is true: 
(h) f(x) = 1 + x³ + x sin x for −2π < x < 3π; 
(Ὁ f(xy) = 1+%x*+ |x| sinx for —37 <x < 37; 
(j) f(x) = 1 − x + 2x² + 4x³ for −4 < x < 3. 


2.16 Determine if upper and lower bounds exist for the following functions and, 
when appropriate, state their values and where they occur on the respective 
domains of definition: 

(a) f(x) = 1/x for x ∈ [1, 4]; 
(b) f(x) = 1/x for x ∈ (0, 3]; 
(c) f(x) = 1 + x² for x ∈ [−2, 1]; 
(d) f(x) = sin x for x ∈ [0, 3π/2]; 
(e) f(x) = tan x for x ∈ (−π/2, π/2). 

2.17 The pairs of numbers enclosed by the curly brackets following each problem 
are upper and lower bounds for the associated function in its stated domain 
of definition. State whether or not each of these bounds is strict: 

(a) f(x) = x³ + x + 1 with x ∈ [1, 2], {0, 11}; 
(b) f(x) = sin x with x ∈ [0, π/2], {0, 2}; 
(c) f(x) = 1/(1 + x²) with x ∈ [0, 2], {1/6, 2}; 
(d) f(x) = sin (1/x) with x ∈ [2/π, ∞), {0, 1}. 

2.18 Determine by sketching whether the following functions are convex, concave 
or neither on their stated domains of definition: 

(a) f(x) = x² for x ∈ [1, 3]; 
(b) f(x) = x for x ∈ [−1, 1]; 
(c) f(x) = a² − x² for x ∈ [−a/2, a]; 
(d) f(x) = x + sin x for x ∈ [0, π/2]; 
(e) f(x) = sin x for x ∈ [0, π]. 

2.19 Give examples of polynomials of degrees 3, 4, and 5 and of a rational function 
having a numerator of degree 2 and a denominator of degree 5. 


2.20 Classify the following functions as polynomial, rational, or algebraic. When 
the function is algebraic, state the degrees of x and y in the polynomial that is 
involved after the surds and fractions have been cleared: 

(a) y = x³ − x² + 1; 
(0) y = χξν — x); 
(c) y = (x — 1/4 + 3x8 — x2 + x + 1); 
(d) y = x + 3√(x² − 2); 
(e) y = (x³ − 3x + 2)/(x − 1). 

Section 2.4 

2.21 Complete the entries in the following table by determining whether the 
functions f map the stated domains A 'into' or 'onto' the domains B. 

f            A          B                Into or onto mapping 
x³           (1, 3]     [0, 30] 
x + sin x    [0, ½π]    [0, ½(2 + π)] 
x²           (1, 4]     [1, 16] 
x⁴           [−1, 2]    [0, 16] 

2.22 Given that f(x) = 2x − 7 with domain (−∞, 20] and g(x) = 10 − x with 
domain [−6, ∞), determine the domain and the range of the composition 
f ∘ g. 

2.23 Given that f(x) = x + 1 with domain (−∞, ∞) and g(x) = 2 + √(4 − x) 
with domain [−5, 4], determine the domain and the range of the composition 
f ∘ g. 

Section 2.5 


2:24 Draw the circles corresponding to « = 4, 4, 3, 1, and 2 in the equation 
(x — 1)? + Ὁ — α)Ζ = αὐ! + «) and sketch the envelope indicating its 
asymptotes for large positive and negative x. 


2:25 Draw the circles corresponding to « = j, 3, 1, 2, and 3 in the equation 
(x — a)? + y? = 3a? and draw the envelope. 


2.26 Deduce the envelope of the family of circles (x − α)² + y² = α², with 
parameter α. 

2.27 Sketch representative ellipses belonging to the family x²/α² + y²/(4 − α)² = 1, 
with parameter α and deduce the shape of the envelope. 

2.28 Draw representative members of the family of straight lines y = αx + 2/α, 
with parameter α, and deduce the shape of the envelope. 

2.29 Sketch the curve represented by the parametric equations x = 2 cos α, 
y = sin α for −π/2 < α < π/2. 

2.30 Sketch the curve represented by the parametric equations x = α² + 1, y = α³ 
for −2 < α < 2. 

2.31 Sketch the curve represented by the parametric equations x = α³ + α² − 2α, 
y = 5 − α for −3 < α < 2. Indicate by arrows on the curve the sense of 
direction corresponding to increasing α. 

2.32 Sketch the curve represented by the parametric equations x = cos α + 
4 cos (α/3), y = sin α + 4 sin (α/3) for 0 < α < 3π/2. Use arguments 
involving even and odd functions to deduce the form taken by the curve for 
0 < α < 6π. 

2.33 Suggest two different parametric representations for the curve y = x² + x + 1 
for 0 < x < 2. 


Section 2.6 


2.34 What are the largest domains of definition for the following functions of 
several variables: 

(a) f(x, y) = 1 + x² + y²; 

(b) fix, y) = G2 + γίναᾳ — x? — γῇ); 

(c) f(x, y) = sin xy/(x² + y² + 1); 

(ὦ fix, y) = 3x2 + y? + V2 —y) + νά -- 2%); 

(e) Κα, ν, 2) = VB -- x») +xVO9—- y) + γνᾷ — 22); 
(Ὁ fx, y, 2) = νὰ + y? — 1) + V4 -- 8 -- yp? = 23). 


2.35 The function f(x, y) = x²y has for its domain of definition the rectangle in 
the (x, y)-plane defined by |x| < 3, |y| < 2. Deduce the shape of the curves 
defined by cross-sections of the surface z = f(x, y) taken by the three planes 
x = −2, x = 0, and x = 2 that are parallel to the (y, z)-axes and by the three 
planes y = −2, y = 0, and y = 2 that are parallel to the (x, z)-axes, using 
your results to sketch the surface. Sketch on one diagram the level curves 
corresponding to z = −4, z = −2, z = 0, and z = 6. 

2.36 Sketch the surface z = f(x, y) defined by the function f(x, y) = 1/(1 + x² 
+ y²) in the domain |x| < 4, |y| < 4. Draw the level curves corresponding 
to z = 1/9, z = 1/3, z = 2/3, and z = 1. 

2.37 The surface z = f(x, y) is defined by the function f(x, y) = 1/[(x − 1)² 
+ (y − 2)² − 1] with 2 ≤ (x − 1)² + (y − 2)² ≤ 9. Deduce the domain of 
definition of the function and then sketch the level curves corresponding to 
z = 4, z = 4, and z = ὃ on the same diagram. Use your result to sketch the 
surface. (Hint: Use the fact that the circle of radius ρ with centre at (a, b) has 
the equation (x − a)² + (y − b)² = ρ².) 


Sequences, limits, and 
continuity 


3.1 Sequences 


The notion of a ‘sequence’ is a constantly recurring one in everyday life, 
where it usually implies the ordering of some set of events with respect to 
time. The sets of events that are so ordered, or arranged, are very varied and 
may be either numerical or non-numerical in nature. Typical examples of 
commonplace sequences in these categories are these: 


(a) the sequence of months in a year; 

(b) the sequence of digits identifying a telephone subscriber; 

(c) the sequence of machining operations required to make a certain 
component. 


However, sequences are not necessarily decided by the chronological 
order of events and they are often determined instead by some attribute 
possessed by the members of the set to be ordered. Thus, for example, two 
commonly occurring sequences to be found in any library are the entries in the 
alphabetic catalogues of authors and titles, neither of which is in the 
chronological order of acquisition of the books. Although these general ideas 
could be discussed at greater length, such an examination is inappropriate 
here, and it must suffice that these few examples show that sequences are 
commonplace in the world around us, and that they need not necessarily 
involve numbers. 

These ideas find an immediate parallel in mathematics, where the natural 
order existing in R combined with the arithmetic properties discussed in 
Chapter 1 enables us to deal very successfully and in great detail with ques- 
tions relating to mathematical sequences. Our main pre-occupation in this 
book will be with sequences of numbers and sequences of functions so we 
must first make the mathematical notion of a sequence more precise. Before 
doing this, however, we must issue a word of warning about the words 
sequence and series. Colloquially they are often used interchangeably, but 
in mathematics they have two quite different meanings which must never be 
confused. In brief, in mathematical terms a sequence is a set of quantities 
that is enumerated in a definite order, whereas a series involves the sum of 
a set of quantities. Thus 1, 3, 5, 7, 9, ... is a sequence but 
1 + 3 + 5 + 7 + 9 + ... is a series. 

If a sequence is composed of elements or terms u belonging to some set S, 
then it is conventional to indicate their order by adding a numerical suffix 
to each term. Consecutive terms in the sequence are usually numbered 
sequentially, starting from unity, so that the first few terms of a sequence 
involving u would be denoted by u₁, u₂, u₃, .... Rather than write out a 
number of terms in this manner this sequence is often represented by {uₙ}, 
where uₙ is the nth term of the sequence. The sequence depends on the set 
chosen for S and the way suffixes are allocated to elements of S. A sequence 
will be said to be infinite or finite according as the number of terms it contains 
is infinite or finite and, unless explicitly stated, all sequences will be assumed 
to be infinite. The notation for a sequence is often modified to {uₙ}ₙ₌₁ᴺ when 
only a finite number N of terms is involved. 

As an example of an infinite numerical sequence, let S be the set of real 
numbers and the rule by which suffixes are allocated be that to each integer 
suffix n we allocate the number 1/2ⁿ which belongs to R. We thus arrive at the 
infinite sequence u₁ = 1/2, u₂ = 1/2², u₃ = 1/2³, ..., which could either be 
written in the form 

1/2, 1/2², 1/2³, ..., 1/2ⁿ, ... 

or, more concisely, in the form 

{1/2ⁿ}. 

Had the set S still been the set R of real numbers, but the rule of allocation 
of suffixes been changed, so that to each integer suffix n chosen from the first 
N natural numbers we allocated the number 1/(2n + 1), then the finite 
sequence 

{1/(2n + 1)}ₙ₌₁ᴺ 

would have resulted. 
If we use the notion of a function f(x) which is defined only for integral 
values of the argument x, the following concise definition can be formulated. 


DEFINITION 3.1 In mathematical terms a sequence is a function f defined 
only for integer values of its argument and having for its range an arbitrary 
set S. 

Hence the first sequence that was displayed could be regarded as resulting 
from the function f(x) = 1/2ˣ with uₙ = f(n), where n is always a positive 
integer. By exactly similar reasoning, the second sequence can be derived 
from the function f(x) = 1/(2x + 1) by setting uₙ = f(n). 
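The view of a sequence as a function of an integer argument translates directly into code. The sketch below uses exact fractions so that the terms print as in the text; the function names and the choice N = 4 are ours.

```python
from fractions import Fraction

def f(n: int) -> Fraction:
    """nth term of the infinite sequence {1/2^n}, defined for n = 1, 2, 3, ..."""
    if n < 1:
        raise ValueError("the suffix n must be a positive integer")
    return Fraction(1, 2**n)

def g(n: int) -> Fraction:
    """nth term of the finite sequence {1/(2n + 1)}, n = 1, ..., N."""
    return Fraction(1, 2 * n + 1)

first_terms = [f(n) for n in range(1, 5)]          # u1, u2, u3, u4
N = 4
finite_sequence = [g(n) for n in range(1, N + 1)]  # all N terms
print(first_terms, finite_sequence)
```

The infinite sequence can only ever be listed in part, whereas the finite one is listed completely once N is fixed.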

The connection between functions and sequences that is established in 
this definition makes it appropriate to describe numerical sequences in the 
same terms as would be used to describe the function giving rise to them. 
Thus if the terms of a sequence {uₙ} are such that m ≤ uₙ ≤ M for all values 
of n then the sequence is said to be bounded, whilst if uₙ₊₁ > uₙ for all n 
then the sequence is said to be strictly monotonic increasing. The terms bounded 
above, bounded below, unbounded, strictly monotonic decreasing, monotonic, 
and oscillating, etc., can also be used in the obvious manner as shown below. 


Example 3.1 

(a) {1/n} is a bounded, strictly monotonic decreasing sequence. The upper 
bound 1 is strict but the lower bound 0 is never actually attained. 

(b) {1/sin (1/n)} is a strictly monotonic increasing sequence, strictly 
bounded below by (sin 1)⁻¹ but unbounded above. 

(c) {(−1)ⁿ/n} is a bounded sequence with strict upper bound ½ and 
strict lower bound −1. 

(d) {uₙ}, where u₂ₘ₋₁ = m/(m + 1) and u₂ₘ = u₂ₘ₋₁. The first 
six terms of this sequence are ½, ½, ⅔, ⅔, ¾, ¾, corresponding 
pairwise, respectively, to m = 1, 2, and 3. The sequence is thus both 
bounded and monotonic increasing. It is not strictly monotonic increasing 
because pairs of terms are equal. The lower bound ½ is strict, 
but the upper bound 1 is never actually attained. 

(e) {(−1)ⁿ} is an oscillating but bounded sequence with strict 
upper bound 1 and strict lower bound −1. 

(f) {(−2)ⁿ} is an oscillating but unbounded sequence. 


Just as a graph proved to be useful when representing functions, so also 
may it be used to represent sequences. Exactly the same method of repre- 
sentation can be adopted, but this time, since the domain of the function 
defining the sequence is the set of natural numbers, the graph of a sequence 
will be a set of isolated points. A typical example is the graph of the first
few terms of the sequence $\{u_n\}$ with $u_n = [n + (-1)^n]/n$, which are shown as
dots in Fig. 3.1 (a).

An obvious deficiency of this representation is that the horizontal axis
must be made unreasonably long if a large number of terms are to be represented.
This can be overcome by the following simple device, which is sometimes
of use since it compresses the representation of the numbers 1 to infinity
onto a line of finite length. The idea is illustrated in Fig. 3.1 (b) where, on the
horizontal axis, the integer n is associated with a point distant 1/n to the left
of a fixed point P. The left end point of the line segment is then associated
with the value 1, the mid-point with the value 2, and so on, with the point P
itself corresponding to an infinite value of n.

An even simpler graphical representation than either of these is often
used in which the values of successive terms in the sequence are plotted
one-dimensionally as points on a straight line relative to some fixed origin.
Because of the identification of the numerical value of a term of the sequence
with a point on a line, the behaviour of a sequence is often spoken of in terms
of the behaviour of the points in this representation (that is, there is a one-one
mapping of $\{u_n\}$ onto the straight line). In terms of this representation, the same
sequence that gave rise to Fig. 3.1 (a) and (b) will appear as in Fig. 3.2. This
could also have been obtained from Fig. 3.1 (a) and (b) by projecting the
points of the graphs horizontally across to meet the vertical axis.

Fig. 3.1 Two alternative graphs of the sequence $\{1 + (-1)^n/n\}$: (a) normal graph;
(b) compressed horizontal axis.

Fig. 3.2 Sequence $\{1 + (-1)^n/n\}$ plotted on a line. All points $u_n$ for n > 5
lie in the neighbourhood marked on the figure.

In each of these three representations, the tendency for the points of the
sequence $\{1 + (-1)^n/n\}$ to cluster around the value unity as n increases is
obvious and clearly expresses an important property possessed by the sequence.
We shall now explore this more fully.

In the sequence just discussed it is obvious that as n increases, so the
points of the sequence cluster ever closer to the unit point in Fig. 3.2. If we
adopt the convention of calling an open interval (a, b) containing some fixed
point a neighbourhood of that point, then it is not difficult to see that any
neighbourhood of the point unity will contain an infinite number of points
of the sequence $\{u_n\}$. In fact in this case we can assert that no matter how small
the length b - a of the neighbourhood, there will always be an infinite number
of points in (a, b) and there will always be a finite number of points outside
(a, b). This is even true when b - a shrinks virtually to zero!

The fact that any neighbourhood of the value unity has the property that 
an infinite number of points of the sequence are contained within it, whereas 
only a finite number of points lie without it, is recognized by saying that the 
limit of the sequence is unity. On account of this name the point corresponding 
to the value unity in Fig. 3-2 is called a limit point of the sequence. We shall 
examine the idea of a limit in the next section, and so for the moment will 
confine discussion to limit points. For this we shall require the notion of a 
sub-sequence. Henceforth, by a sub-sequence we shall mean a sequence
$u_{n_1}, u_{n_2}, \ldots, u_{n_m}, \ldots$ of terms belonging to the sequence $\{u_n\}$, where
$n_1, n_2, \ldots, n_m, \ldots$ is some numerically ordered set of integers selected
from the complete set of natural numbers. Thus $u_2, u_9, u_{27}, u_{31}, \ldots$ is a
sub-sequence of $u_1, u_2, u_3, \ldots$ and obviously $\{u_2, u_9, u_{27}, u_{31}, \ldots\} \subset \{u_n\}$.

In terms of this we now give the following formal definition of a limit
point of a sequence $\{u_n\}$.


DEFINITION 3.2 A point u* is said to be a limit point of the sequence $\{u_n\}$
if every neighbourhood of u* contains an infinite number of elements of
the sequence $\{u_n\}$.
Since we have not insisted that there be a finite number of points outside any
neighbourhood of a limit point it follows that a sequence may have more than
one limit point. We shall show by example that a limit point may or may not
be a member of the sequence that defines it. This result when applied to
sequences with only one limit point will later be seen to be very important,
since it provides the justification for the approximation to irrational numbers
in calculations by rational numbers. In sequences involving only one limit
point the sequence will be said to converge to the value associated with the
limit point. This value will be called the limit of the sequence.

Not all sequences have limit points and the following examples exhibit 
sequences having three, one, and no limit points, respectively. 


Example 3.2

(a) $\{\sin[\pi(n^2 + 1)/2n]\}$ has the three limit points -1, 0, and 1, of which 0
is a member of the sequence and the other two are
not. The sequence does not converge.


(b) $\{(1/n)\sin(n\pi/2)\}$ has only one limit point at zero which is a member
of the sequence. The sequence converges to zero.

(c) $\{n^2\}$ has no limit point and so the sequence does not
converge.
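
The clustering described in Example 3.2 (a) can be observed numerically. The following is a minimal Python sketch, assuming the general term is $u_n = \sin[\pi(n^2 + 1)/2n]$, the form quoted for this sequence in Section 3.2; the function name `u` is chosen here purely for illustration. Because the argument equals $n\pi/2 + \pi/(2n)$, the terms fall into classes according to the remainder of n on division by 4.

```python
import math

def u(n):
    """General term u_n = sin(pi*(n^2 + 1)/(2n)); assumed form of Example 3.2 (a)."""
    return math.sin(math.pi * (n * n + 1) / (2 * n))

# n = 4k and n = 4k + 2 give terms near 0, n = 4k + 1 terms near 1,
# and n = 4k + 3 terms near -1.
for n in (4000, 4001, 4002, 4003):
    print(n, n % 4, round(u(n), 4))

# The first term is sin(pi), so 0 is (up to rounding) a member of the sequence.
print(u(1))
```

Each of the three sub-sequences stays inside any small neighbourhood of its own limit point from some n onwards, which is why no single neighbourhood can exclude only finitely many terms.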


One of the most important applications of the notion of a sequence is to 
the study of series. The difficulty here is to give a meaning to the sum of an 
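infinite number of terms. What, for example, is the meaning of

$$\sum_{n=1}^{\infty} \frac{1}{n!}\,? \qquad \text{(A)}$$

The solution is to be found in the behaviour of the sequence $\{s_m\}$ defined by

$$s_m = \sum_{n=1}^{m} \frac{1}{n!}.$$

The first few terms of the sequence $\{s_m\}$ are

$$s_1 = \frac{1}{1!}, \quad s_2 = \frac{1}{1!} + \frac{1}{2!}, \quad s_3 = \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!}, \quad s_4 = \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!},$$
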
and obviously all such terms $s_m$ will only involve the sum of a finite number
of numbers. For obvious reasons $s_m$ is called the mth partial sum of the series
(A). The interpretation of the infinite sum (A) is to be found in the behaviour
of the Nth term of $\{s_m\}$, namely the Nth partial sum $s_N$, as N tends to infinity.
If $\{s_m\}$ has only one limit point at which $s_m$ tends to some number S, then this
will be called the sum of the series. If S is infinite the series will be said to
diverge. A moment's reflection will show the reader that this is the practical
approach to the problem, since the term $s_N$ is the sum of the first N terms of
the infinite series (A), and it seems reasonable to assume that when the value
of (A) is finite, it must be close to the value $s_N$ when N is suitably large.
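
The behaviour of the partial sums can be examined directly. The sketch below is a small Python illustration, assuming series (A) is the sum of 1/n! from n = 1; the name `partial_sum` is hypothetical. It uses exact fractions so that no rounding enters until the values are printed. The comparison value e - 1 is a standard fact about this series, not something established at this point in the text.

```python
from fractions import Fraction
import math

def partial_sum(m):
    """Return s_m = 1/1! + 1/2! + ... + 1/m! as an exact fraction."""
    total = Fraction(0)
    factorial = 1
    for n in range(1, m + 1):
        factorial *= n                  # running value of n!
        total += Fraction(1, factorial)
    return total

for m in (1, 2, 3, 4, 10, 20):
    print(m, partial_sum(m), float(partial_sum(m)))

# Standard value of the infinite sum, quoted here for comparison only.
print(math.e - 1)
```

The printed values settle down after only a few terms, which is the sense in which $s_N$ is close to the value of (A) when N is suitably large.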

These preliminary ideas on series must suffice for now, but we shall take
them up again later and devise tests to determine whether series are convergent
or divergent.


3.2 Limits of sequences


The term limit was first introduced intuitively in the previous section in
connection with a sequence $\{u_n\}$ which had only one limit point. As n increases so
the points representing the terms $u_n$ cluster ever closer to the limit point
whose value L, say, is the limit of the sequence. This idea of a limit is correct
in spirit but it is not very satisfactory from the mathematical manipulative
point of view since the phrase 'cluster ever closer to' is far too vague. The
difficulty of making the expression 'limit' precise is connected with the exact
meaning we give to this phrase.

Our difficulty can be resolved if we recall that any neighbourhood of a
limit point will contain an infinite number of points of the sequence and,
if there is only one limit point, will exclude only a finite number of points.
Thinking in terms of numbers rather than points, a neighbourhood of a limit
point is simply an open interval of the line on which the numbers $u_n$ are
plotted and we already have a notation for representing such an interval.
Suppose, for convenience, that the neighbourhood is symmetrical about the
number L and of width 2ε, where ε is some arbitrarily small positive number.
Then a variable u will be inside this neighbourhood if $L - \varepsilon < u < L + \varepsilon$.
Recalling the definition of 'absolute value', this inequality can be rewritten
concisely as $|u - L| < \varepsilon$. Different values of ε > 0 determine different
neighbourhoods, and if u is identified with the term $u_n$ of the sequence, then
L is the limit of the sequence if, no matter how small ε may become, only a
finite number of terms $u_n$ lie outside the neighbourhood and an infinite
number lie within it.

We can now give a proper definition of a limit. 


DEFINITION 3.3 The sequence $\{u_n\}$ will be said to tend to the limit L if,
and only if, for any arbitrarily small positive number ε, there exists an integer
N such that

$$n > N \Rightarrow |u_n - L| < \varepsilon.$$


Let us test our definition on the sequence $\{u_n\}$ with $u_n = 1 + (-1)^n/n$.
We already know that this sequence has only one limit point at the value
unity, and consequently our definition should show that the limit is unity.
Suppose, for the sake of argument, that we check to see that the definition is
satisfied if ε = 1/100. To do this we must find a number N such that when
n > N we have

$$\left|\left(1 + \frac{(-1)^n}{n}\right) - 1\right| < \frac{1}{100}.$$

This result is obviously equivalent to the requirement that 1/n < 1/100,
which will be true for any value of n greater than 100. Hence if we take
N = 100 the conditions of the definition are satisfied. There are thus 100
terms outside the neighbourhood and an infinite number within it.

Had we demanded a much smaller value of ε, say $\varepsilon = 10^{-6}$, the identical
argument would have shown that the definition is satisfied if $N = 10^6$.
There would now be a very large number of terms outside the neighbourhood
$0.999999 < u_n < 1.000001$, in fact $10^6$ in all, but this is still a finite number
whereas the number of terms within the neighbourhood is still infinite.
Clearly, however small the value of ε, the conditions of the definition will still
apply showing that it is in accord with our earlier intuitive ideas.
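
The counting argument above is easy to check by machine. The following Python sketch is offered only as an illustration: the names `u` and `smallest_N` are introduced here and are not part of the text. It relies on the fact that $|u_n - 1| = 1/n$, so that the condition of Definition 3.3 holds exactly when n > 1/ε, and it works in exact fractions to avoid rounding at the boundary case 1/n = ε.

```python
from fractions import Fraction
import math

def u(n):
    """General term u_n = 1 + (-1)^n / n, kept exact with fractions."""
    return 1 + Fraction((-1) ** n, n)

def smallest_N(eps):
    """Smallest integer N such that |u_n - 1| < eps for every n > N.

    Since |u_n - 1| = 1/n, the condition is simply n > 1/eps.
    """
    return math.floor(1 / Fraction(eps))

eps = Fraction(1, 100)
N = smallest_N(eps)

# Terms lying outside the neighbourhood |u - 1| < eps among the first 10*N terms.
outside = [n for n in range(1, 10 * N) if abs(u(n) - 1) >= eps]
print(N, len(outside))
```

Changing `eps` to `Fraction(1, 10**6)` in `smallest_N` gives the second case discussed above without any change to the argument.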

In general, when the sequence $\{u_n\}$ has a limit L, so that we say it converges
to L, we shall write

$$\lim_{n \to \infty} u_n = L.$$

Whenever using this notation for a limit the reader must always keep in
mind the underlying formal definition just given.

The definition and the illustrative example just given show that when a
sequence has only one limit point, then it must converge to the value associated
with that limit point. Any sequence such as $\{u_n\}$ with $u_n = \sin\{\pi(n^2 + 1)/2n\}$
cannot have a limit, for it has three limit points at -1, 0, and 1 and any small
neighbourhood taken about any one must, of necessity, exclude the infinitely
many terms associated with the other two. Such a sequence does not converge.

Frequently the limit of a sequence is of more importance than its individual
terms, and in such circumstances the notation $\lim_{n \to \infty} u_n$ is advantageous in that
it focusses attention on the general term $u_n$ of the sequence. The result of the
limiting operation is often readily deduced from the general term as these
examples indicate.


Example 3.3 Determine the limits in each of the following:

(a) $\lim_{n \to \infty} \dfrac{(2n - 1)(n + 4)(n - 2)}{n^3}$;

(b) $\lim_{n \to \infty} \dfrac{1 + 2 + \cdots + (n - 1)}{n^2}$;

(c) $\lim_{n \to \infty} \dfrac{5^{n+1} + 7^{n+1}}{5^n - 7^n}$;

(d) $\lim_{n \to \infty} \dfrac{1^2 + 2^2 + \cdots + n^2}{n^2}$.


Solution (a) The general term is $u_n = [(2n - 1)(n + 4)(n - 2)]/n^3$, so that
expanding the numerator and dividing by $n^3$ gives

$$u_n = 2 + \frac{3}{n} - \frac{18}{n^2} + \frac{8}{n^3}.$$

Obviously, as n increases, the last three terms comprising $u_n$ approach zero,
and in the limit we have

$$\lim_{n \to \infty} \frac{(2n - 1)(n + 4)(n - 2)}{n^3} = 2.$$


Solution (b) The general term is $u_n = [1 + 2 + \cdots + (n - 1)]/n^2$, in
which the numerator is the sum of an arithmetic progression. Now it is
readily verified that $1 + 2 + \cdots + (n - 1) = n(n - 1)/2$ so that

$$u_n = \frac{n(n - 1)}{2n^2} = \frac{1}{2}\left(1 - \frac{1}{n}\right).$$

Using the same argument as in (a) above we see at once that as n increases
so $u_n$ approaches the value 1/2, whence

$$\lim_{n \to \infty} \frac{1 + 2 + \cdots + (n - 1)}{n^2} = \frac{1}{2}.$$


Solution (c) The general term here is $u_n = (5^{n+1} + 7^{n+1})/(5^n - 7^n)$ and by
dividing numerator and denominator by $7^n$ it may be written:

$$u_n = \frac{5(5/7)^n + 7}{(5/7)^n - 1}.$$

Now 5/7 < 1 so that $(5/7)^n$ will tend to zero as n increases. Thus $u_n$ will
approach the value -7. In this case we may write

$$\lim_{n \to \infty} \frac{5^{n+1} + 7^{n+1}}{5^n - 7^n} = -7.$$


Solution (d) The general term is $u_n = [1^2 + 2^2 + \cdots + n^2]/n^2$, in which
the numerator is the sum of the squares of the first n natural numbers. Using
the familiar result

$$1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$$

enables us to write

$$u_n = \frac{(n + 1)(2n + 1)}{6n}.$$

It is obvious that the numerator is quadratic in n whereas the denominator
is first degree or linear in n. Hence as n increases without bound, so will $u_n$.
This sequence diverges and we write

$$\lim_{n \to \infty} \frac{1^2 + 2^2 + \cdots + n^2}{n^2} \to \infty.$$


Notice that we do not use the equality sign in connection with the symbol
∞, in accordance with the idea that infinity is not an actual number but
essentially a limiting process.
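
The four results of Example 3.3 can be checked numerically. The Python sketch below is only an illustration; the function names `u_a` to `u_d` are introduced here. Each general term is evaluated as an exact fraction, so the large powers in (c) cause no overflow, and is converted to a decimal only for printing.

```python
from fractions import Fraction

def u_a(n):
    """(2n - 1)(n + 4)(n - 2) / n^3, expected to approach 2."""
    return Fraction((2 * n - 1) * (n + 4) * (n - 2), n ** 3)

def u_b(n):
    """[1 + 2 + ... + (n - 1)] / n^2, expected to approach 1/2."""
    return Fraction(sum(range(1, n)), n ** 2)

def u_c(n):
    """(5^(n+1) + 7^(n+1)) / (5^n - 7^n), expected to approach -7."""
    return Fraction(5 ** (n + 1) + 7 ** (n + 1), 5 ** n - 7 ** n)

def u_d(n):
    """[1^2 + 2^2 + ... + n^2] / n^2, expected to grow without bound."""
    return Fraction(sum(k * k for k in range(1, n + 1)), n ** 2)

for n in (10, 100, 1000):
    print(n, float(u_a(n)), float(u_b(n)), float(u_c(n)), float(u_d(n)))
```

The first three columns settle towards 2, 1/2 and -7, while the last grows roughly like n/3, in line with the general term (n + 1)(2n + 1)/6n.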

Before continuing our discussion of limits, let us introduce a useful 
notation. In the examples above it is apparent that the value of the limit of a 
sequence involving the ratio of two expressions as n increases, is entirely 
determined by the ratio of the most significant terms in the numerator and 
denominator. In the case of a polynomial involving n, the most significant 
term as n increases is obviously the highest degree term in which it appears. 
Thus in (a), an inspection of the brackets in the numerator shows the most
significant term to be $2n^3$, and as the denominator only involves $n^3$, it is at
once obvious that for large n the ratio will approach $2n^3/n^3 = 2$.

To streamline limiting arguments of this type, and yet to preserve some- 
thing of the effect of the less significant terms, we now introduce the so-called 
‘big oh’ notation appropriate to functions. 


DEFINITION 3.4 We say that the function f(x) is of the order of the function
g(x), written f(x) = O(g(x)) if, for some set of values of x,

(a) g(x) > 0

and

(b) |f(x)| < M g(x),

where M is some constant.


The value of the constant M is usually unimportant as for most arguments
it suffices that such an M should exist. We have these obvious results:

$2x^3 + 2x + 1 = O(x^3)$,

$3x + \sin x = O(x)$,

$\sin x = O(1)$,

where the symbol O(1) has been used to denote a constant.
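
A quick numerical check of conditions (a) and (b) of Definition 3.4 may help fix ideas. The sketch below assumes the three results read $2x^3 + 2x + 1 = O(x^3)$, $3x + \sin x = O(x)$ and $\sin x = O(1)$, and tests them on the set x ≥ 1. The constants M = 6, 5 and 2 are hypothetical choices made here; the definition only requires that some such M exists. Sampling finitely many points illustrates the bound but does not prove it.

```python
import math

def is_bounded_by(f, g, M, xs):
    """True if g(x) > 0 and |f(x)| < M*g(x) at every sample point x."""
    return all(g(x) > 0 and abs(f(x)) < M * g(x) for x in xs)

xs = [1 + 0.5 * k for k in range(2000)]   # sample points 1, 1.5, ..., 1000.5

print(is_bounded_by(lambda x: 2 * x**3 + 2 * x + 1, lambda x: x**3, 6, xs))
print(is_bounded_by(lambda x: 3 * x + math.sin(x), lambda x: x, 5, xs))
print(is_bounded_by(math.sin, lambda x: 1.0, 2, xs))

# By contrast x^2 is not O(x) on this set: no fixed M works for large x.
print(is_bounded_by(lambda x: x**2, lambda x: x, 5, xs))
```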


In terms of this notation we may write the general term $u_n$ in Example 3.3 (a)
in the simplified form

$$u_n = \frac{2n^3 + O(n^2)}{n^3}, \quad \text{whence} \quad u_n = 2 + \frac{O(n^2)}{n^3}. \qquad \text{(A)}$$

By virtue of the definition of the symbol 'big oh', $O(n^2)$ implies an expression
whose absolute value is bounded above by $Mn^2$, so that $|O(n^2)/n^3| < (Mn^2)/n^3 = M/n$.
However, $M/n \to 0$ as n increases without bound, so that

$$\lim_{n \to \infty} u_n = 2. \qquad \text{(B)}$$

Normally the argument just outlined would be omitted, so that result (B)
would be written down immediately after (A).

Implicit in the examples just examined are results which we now combine. 


THEOREM 3.1 If it can be shown that $u_1, u_2, u_3, \ldots$ and $v_1, v_2, v_3, \ldots$
are two sequences such that $\lim_{n \to \infty} u_n = L$ and $\lim_{n \to \infty} v_n = M$, then

(a) $u_1 + v_1, u_2 + v_2, u_3 + v_3, \ldots$ is a sequence such that
$\lim_{n \to \infty} (u_n + v_n) = L + M$;

(b) $u_1 v_1, u_2 v_2, u_3 v_3, \ldots$ is a sequence such that $\lim_{n \to \infty} u_n v_n = LM$;

(c) provided M ≠ 0, $u_1/v_1, u_2/v_2, u_3/v_3, \ldots$ is a sequence such that
$\lim_{n \to \infty} (u_n/v_n) = L/M$.


These assertions are virtually self-evident and so we prove only the first
result, making full use of our definition of a limit and of the triangle inequality
of Theorem 1.4.

Suppose ε is given. Then, because $\{u_n\}$ converges to the limit L, there
exists a number $N_1$ such that $n > N_1 \Rightarrow |u_n - L| < \tfrac{1}{2}\varepsilon$. By the same argument,
because $\{v_n\}$ converges to M, there exists another number $N_2$ such that
$n > N_2 \Rightarrow |v_n - M| < \tfrac{1}{2}\varepsilon$.
Now $|(u_n + v_n) - (L + M)| = |(u_n - L) + (v_n - M)| \le |u_n - L| + |v_n - M|$,
and so $n > \max(N_1, N_2) \Rightarrow |(u_n + v_n) - (L + M)| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon$. Thus, taking
$N = \max(N_1, N_2)$, and given an arbitrarily small positive number ε, we have

$$n > N \Rightarrow |(u_n + v_n) - (L + M)| < \varepsilon$$

or

$$\lim_{n \to \infty} (u_n + v_n) = L + M.$$

In effect, this theorem justifies any argument in which it is asserted that,
if a is close to A and b is close to B, then a + b is close to A + B, ab is close
to AB, and, provided b and B ≠ 0, a/b is close to A/B.


THEOREM 3.2 Let $\{u_n\}$ and $\{v_n\}$ be two sequences which both converge to
the same limit L, and suppose $\{w_n\}$ to be a third sequence. Then if for all n
greater than some fixed value N, it is true that $u_n < w_n < v_n$, the sequence
$\{w_n\}$ converges. Furthermore, the limit of the sequence $\{w_n\}$ is also L.
The proof of this theorem is not difficult and so is left to the reader as an
exercise. In essence it involves two stages. The first is to establish that
$\{u_n - w_n\}$ and $\{w_n - v_n\}$ are both null sequences in the sense that they converge
to the limit zero. The second involves the use of Theorem 3.1 (a) to
establish that these two null sequences imply $\lim_{n \to \infty} w_n = L$.


In applications use of this theorem is often confined to proving that a
given sequence $\{w_n\}$ converges, so that the sequences $\{u_n\}$ and $\{v_n\}$ then need
to be devised to satisfy the conditions of the theorem.


Example 3.4 Given that 
l 1 i" I 
2 Qn-2 "3, QW 
use Theorem 3-2 to prove that the sequence {w,} converges and to find the 
limit. 
Now, obviously 


] 
Wh =1+ Nong eo 


$$1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} < w_n < 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n},$$

and so using the expression for the sum of a geometric progression we may 
write 


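$$2\left[1 - \left(\tfrac{1}{2}\right)^n\right] < w_n < 2\left[1 - \left(\tfrac{1}{2}\right)^{n+1}\right].$$

Thus for the sequence $\{u_n\}$ we take $u_n = 2[1 - (\tfrac{1}{2})^n]$ and for the sequence
$\{v_n\}$ we take $v_n = 2[1 - (\tfrac{1}{2})^{n+1}]$. The conditions of the theorem are then
satisfied, since $\lim_{n \to \infty} u_n = \lim_{n \to \infty} v_n = 2$. Hence the sequence $\{w_n\}$ converges
and has for its limit the value 2.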


At this stage in our discussion of sequences the following result should be 
self evident and we state it in the form of a postulate, rather than prove it. 


POSTULATE Every increasing sequence which is bounded above tends to 
a limit. 


The proof of this postulate is outlined in Problem 3.20 at the end of the 
chapter. The details are left to the reader, together with the task of showing 
the consequence that every decreasing sequence which is bounded below must 
also tend to a limit. 

It is this postulate that validates the usual arithmetic procedure for finding
a square root. In the procedure an additional digit is added to the approximation
at each stage, thereby giving rise to an increasing sequence that is
bounded above. With a number such as $\sqrt{2}$, which we know to be irrational,
this same postulate also justifies its successive approximation by the increasing
sequence $\{u_n\}$ of rational numbers 1, 1.4, 1.41, 1.414, 1.4142, ..., $u_n$, ....
In this case an irrational number $\sqrt{2}$ is determined as the limit of a sequence
of rationals. The implications are important, since although irrational
numbers are of frequent occurrence, in the world in which we live we can
only undertake practical calculations using rationals!

Not all sequences are defined explicitly by giving an expression for the
general term $u_n$. Often a sequence is defined recursively by giving a formula
relating the term $u_n$ to its predecessor $u_{n-1}$, and then specifying the value of
$u_1$. This is, of course, a difference equation, but in this context it is customary
to call any rule of this kind a recurrence relation, and one of considerable
computational importance is

$$u_n = \frac{1}{m}\left[(m - 1)u_{n-1} + \frac{a}{u_{n-1}^{m-1}}\right],$$

where m is an integer greater than unity.

The particular significance of this recurrence relation stems from the fact
that by using Theorem 3.2 it is not difficult to prove the rather surprising
result that $\{u_n\}$ always converges to the limit $\sqrt[m]{a}$, irrespective of the choice
of $u_1$ provided only that it is positive. The value of the limit is obvious once
convergence has been established, for denoting it by L and setting $u_{n-1} = u_n
= L$, it follows directly from the recurrence relation that $L^m = a$.

Table 3.1 shows the effectiveness of this method as a computational
procedure or algorithm for computing $\sqrt{2}$ to five figures, using three different
starting values for $u_1$. To use the relation to compute $\sqrt{2}$ we must first set
m = 2 and a = 2 when it becomes

$$u_n = \frac{1}{2}\left(u_{n-1} + \frac{2}{u_{n-1}}\right).$$

Taking as representative the three starting values $u_1$ = 1, 1.4, and 5, we
obtain Table 3.1 in which a dash signifies that no further change occurs in the
last digit.


Table 3.1 Values of $u_n$ for three starting values $u_1$

n    u_1 = 1    u_1 = 1.4    u_1 = 5
1    1          1.4          5
2    1.5        1.41429      2.7
3    1.41667    1.41421      1.72037
4    1.41422    -            1.44146
5    1.41421    -            1.41447
6    -          -            1.41421
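
The entries of Table 3.1 are straightforward to regenerate. The following Python sketch assumes the m = 2, a = 2 form of the recurrence, $u_n = \tfrac{1}{2}(u_{n-1} + 2/u_{n-1})$, which is consistent with every entry of the table; the function name `sqrt2_iterates` is hypothetical. Ordinary floating-point arithmetic is used, which is ample for five decimal places.

```python
def sqrt2_iterates(u1, steps):
    """Return [u_1, ..., u_steps] for u_n = (u_{n-1} + 2/u_{n-1}) / 2."""
    terms = [float(u1)]
    for _ in range(steps - 1):
        prev = terms[-1]
        terms.append(0.5 * (prev + 2.0 / prev))
    return terms

# One row of output per starting value, rounded to five decimal places.
for start in (1, 1.4, 5):
    print(start, [f"{u:.5f}" for u in sqrt2_iterates(start, 6)])
```

Where the table shows a dash, the printed value simply repeats 1.41421, since the iteration has already settled to five decimal places.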
Obviously convergence is most rapid when the value assumed for $u_1$ is a
good approximation to the answer, and much effort may be spared by taking
a sensible starting approximation.


3.3 The number e


Later we shall use an important mathematical constant that is always denoted 
by the symbol e. This number is both irrational and transcendental, and for 
reference purposes its value to ten decimal places is 


e = 2.7182818284.


There are numerous different ways of defining this constant, but although 
these are interesting, our real concern later in this book will be with the 
mathematical use of the constant e. We shall, for example, see how it is of 
fundamental importance in the study of differential equations and in the 
definition of important mathematical functions like the natural logarithm 
and the hyperbolic functions sinh x, cosh x, and tanh x. 

However, the real purpose of this section will not be to study these
applications, but to examine one interesting definition of e as the limit of a
particular sequence. This problem provides both a first encounter with e,
and also a useful illustration of how approximate information may be extracted
from the properties of a difficult sequence. We shall prove that if

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n \qquad (3.1)$$

then 2 < e < 3. The problem of determining e correctly to any given number
of figures will be deferred until we are better equipped for the task.

Consider the sequence $\{u_n\}$ with the general term

$$u_n = \left(1 + \frac{1}{n}\right)^n.$$

We will first establish that $\{u_n\}$ is a strictly increasing sequence, so that
$u_{n+1} > u_n$, and then show that the sequence $\{u_n\}$ is bounded above by the
number 3. The postulate of the previous section then establishes that the
limit e exists and is such that e < 3. Finally, the lower bound 2 will be added
as a trivial consequence of the proof used to establish the upper bound.
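First let us expand $u_n$ by the binomial theorem:

$$u_n = 1 + n\left(\frac{1}{n}\right) + \frac{n(n - 1)}{2!}\left(\frac{1}{n}\right)^2 + \frac{n(n - 1)(n - 2)}{3!}\left(\frac{1}{n}\right)^3 + \cdots + \frac{n(n - 1)(n - 2) \cdots 2 \cdot 1}{n!}\left(\frac{1}{n}\right)^n.$$

Now rewrite this:

$$u_n = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n - 1}{n}\right). \qquad (3.2)$$

An exactly similar argument applied to $u_{n+1}$ then gives

$$u_{n+1} = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n + 1}\right) + \frac{1}{3!}\left(1 - \frac{1}{n + 1}\right)\left(1 - \frac{2}{n + 1}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n + 1}\right)\left(1 - \frac{2}{n + 1}\right) \cdots \left(1 - \frac{n - 1}{n + 1}\right) + \frac{1}{(n + 1)!}\left(1 - \frac{1}{n + 1}\right)\left(1 - \frac{2}{n + 1}\right) \cdots \left(1 - \frac{n}{n + 1}\right).$$
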
Now all the terms in $u_n$ and $u_{n+1}$ are positive and $u_{n+1}$ has one more term than
$u_n$. In addition, terms in $u_{n+1}$ that are associated with factorials are larger
than the corresponding terms in $u_n$ because of the obvious inequalities

$$1 - \frac{r}{n + 1} > 1 - \frac{r}{n}, \qquad r = 1, 2, \ldots, n - 1.$$

Hence $u_{n+1} > u_n$, showing that $\{u_n\}$ is a strictly increasing sequence.

To show that $\{u_n\}$ is bounded above we must try to sum the finite series
for $u_n$ and then examine the behaviour of the sum as n increases. As the
finite series (3.2) stands we can make no progress, but an overestimate of
this sum can easily be obtained if the terms of the series are simplified. This
approach will suffice for our purposes, since to prove that the limit e exists,
we only need to prove that $\{u_n\}$ is strictly increasing and bounded above; a
strict upper bound is not necessary here. It is only needed when the exact
value of the limit is to be determined.

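If we use the obvious inequalities

$$1 > \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{r}{n}\right), \qquad r = 1, 2, \ldots, n - 1,$$

it follows at once from Eqn (3.2) that

$$u_n < 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}. \qquad (3.3)$$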
This is still too difficult to sum explicitly, so using the observation

$$\frac{1}{3!} < \frac{1}{2^2}, \quad \frac{1}{4!} < \frac{1}{2^3}, \quad \ldots, \quad \frac{1}{n!} < \frac{1}{2^{n-1}},$$

we further simplify Eqn (3.3) to the form

$$u_n < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^{n-1}}. \qquad (3.4)$$

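This can now be summed, since after the first term the remaining terms
form a geometric progression. We arrive at the result

$$u_n < 1 + \frac{1 - (1/2)^n}{1 - 1/2} = 3 - \left(\frac{1}{2}\right)^{n-1},$$

whence $\lim_{n \to \infty} u_n < 3$. (The inequality remains strict in the limit because,
for n ≥ 3, replacing 1/3! = 1/6 by 1/4 alone overestimates $u_n$ by the fixed amount 1/12.)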

The conditions of our postulate are satisfied, so we may conclude that
$\{u_n\}$ has a finite limit e and, furthermore, that e < 3. Examination of Eqn
(3.2) shows that $u_n > 2$ for all n > 1 so that finally we have established our claim
that

$$2 < e < 3.$$
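
Both halves of the argument, monotonic increase and boundedness, can be confirmed for particular values of n. The Python sketch below is an illustration only; `u` and `upper_bound` are names introduced here, and the bound $3 - (1/2)^{n-1}$ is the value obtained by summing the geometric progression in the overestimate of $u_n$. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def u(n):
    """u_n = (1 + 1/n)^n as an exact fraction."""
    return (1 + Fraction(1, n)) ** n

def upper_bound(n):
    """Geometric-series overestimate 3 - (1/2)^(n-1) of u_n."""
    return 3 - Fraction(1, 2 ** (n - 1))

for n in (1, 2, 5, 10, 100, 1000):
    print(n, float(u(n)), float(upper_bound(n)))

# Checks over a range of n: strictly increasing, never below 2, never above the bound.
print(all(u(n) < u(n + 1) for n in range(1, 60)))
print(all(2 <= u(n) <= upper_bound(n) for n in range(1, 60)))
```

The printed values of $u_n$ creep upwards only slowly, which is why this sequence is a poor practical means of computing e to many figures.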


The form of argument used to overestimate series (3.2) is often useful and
the series in the final inequality (3.4) is usually called a majorizing series.
Closely related to limit (3.1) is the sequence $\{v_n(x)\}$ with general term

$$v_n(x) = \left(1 + \frac{x}{n}\right)^n. \qquad (3.5)$$


To establish the relationship that exists between e and the limit of $\{v_n(x)\}$
let us first denote the limit by E(x), so that

$$E(x) = \lim_{n \to \infty} \left[\left(1 + \frac{x}{n}\right)^n\right]. \qquad (3.6)$$


Suppose x > 0 to be any rational number and define an increasing sequence
$\{n_k\}$ of natural numbers by the requirement that the numbers $n_k/x$ are integral.
Henceforth we shall set $N_k = n_k/x$. Then by restricting n to be a member of
$\{n_k\}$ we may define a sub-sequence $\{v_{n_k}(x)\}$ of $\{v_n(x)\}$ for which Eqn (3.5) may
be written in the form

$$v_{n_k}(x) = \left(1 + \frac{x}{n_k}\right)^{n_k} = \left[\left(1 + \frac{1}{N_k}\right)^{N_k}\right]^x. \qquad (3.7)$$

Using the definition of $u_n$ we see that

$$v_{n_k}(x) = (u_{N_k})^x,$$

so that taking the limit as $n_k \to \infty$ we have

$$E(x) = \lim_{n_k \to \infty} v_{n_k}(x) = \lim_{N_k \to \infty} (u_{N_k})^x = \left[\lim_{N_k \to \infty} u_{N_k}\right]^x = e^x,$$

whence the important result

$$E(x) = e^x. \qquad (3.8)$$


With a more subtle argument it can be established that Eqn (3.8) is
generally true without the restriction of n to the sequence $\{n_k\}$. This implies
that the result is true for all real x.
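
Result (3.8) can be illustrated numerically for a few values of x, including a negative one. The Python sketch below is an illustration under the assumption that $v_n(x) = (1 + x/n)^n$; the name `v` is introduced here, and the library function `math.exp` supplies $e^x$ for comparison. A large but finite n gives agreement only to a few figures, since the relative error behaves roughly like $x^2/2n$.

```python
import math

def v(n, x):
    """General term v_n(x) = (1 + x/n)^n."""
    return (1 + x / n) ** n

n = 10**6
for x in (0.5, 1.0, 2.0, -1.0):
    print(x, v(n, x), math.exp(x))
```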


Fig. 3.3 Graph of the functions $e^x$ and $e^{-x}$.


The function $e^x$ is one of the most important functions in mathematics
and it is called the exponential function. Fig. 3.3 shows its behaviour with x.
Notice that it is an essentially positive function which is strictly monotonic
increasing with x. Also shown on the figure is the associated function $e^{-x}$.


3.4 Limits of functions: continuity


The notion of the limit of a function f(x) as x tends towards some value a
is intuitively obvious in the case of functions whose graph is an unbroken
curve. A typical function of this kind is illustrated in Fig. 3.4 from which it
is easily seen that if x is considered to be a moving point, then f(x) will
approach the value f(a) as x approaches a from either the left or the right.
In this case f(x) actually attains the value f(a), and we shall speak of f(a) as
the 'limit of f(x) as x tends to a' and write

$$\lim_{x \to a} f(x) = f(a).$$

Fig. 3.4 Function f(x) with unbroken graph.


Thus, if $f(x) = x^3 - 2x^2 + x + 3$, then clearly in this case $\lim_{x \to 2} f(x)
= 5 = f(2)$. A slightly less obvious example involves finding $\lim_{x \to 1} f(x)$ when

$$f(x) = \frac{\sqrt{x} - 1}{x - 1},$$

since the formal substitution of x = 1 in f(x) seems to yield 0/0 which is
meaningless as it stands. The difficulty here is easily resolved by writing
$x - 1 = (\sqrt{x} - 1)(\sqrt{x} + 1)$ and cancelling the
factor $(\sqrt{x} - 1)$ in the numerator and denominator to give

$$f(x) = \frac{1}{\sqrt{x} + 1},$$

from which it is apparent that $\lim_{x \to 1} f(x) = 1/2$.


In effect, the intuitive notion involved in the limit of a function is essentially
the same as that for the limit of a sequence. Namely, we say that L is the
limit of f(x) as x tends to a if, for all x sufficiently close to a, f(x) is close to L.
In fact, the determination of the value of the limit L involves the behaviour of
f(x) near to x = a, but does not consider the actual value of f(x) at x = a.


Fig. 3.5 Function f(x) has a smooth graph and attains the limit L at x = a.
(Figure annotations: range $|f(x) - L| < \varepsilon$; domains $0 < |x - a| < \delta$ and $0 < |x - b| < \delta'$.)


Whether or not f(a) is actually equal to L, as was the case above, is immaterial.
By only slightly modifying our definition of the limit of a sequence, we arrive
at the following definition of the limit of a function, which is illustrated in
Fig. 3.5, and will be used for our subsequent discussion of continuity.


DEFINITION 3.5 The function f(x) will be said to tend to the limit L as x
tends to a if, and only if, for any arbitrarily small positive number ε, there
exists a small positive number δ such that

$$0 < |x - a| < \delta \Rightarrow |f(x) - L| < \varepsilon.$$

The significance of the condition $0 < |x - a| < \delta$ is that the value
f(a) is specifically excluded from consideration as being irrelevant to the
determination of the limit. Thus, if

$$f(x) = \begin{cases} 1 + x & \text{for } x \ne 1, \\ 5 & \text{for } x = 1, \end{cases}$$

then $\lim_{x \to 1} f(x) = 2$, despite the fact that f(1) = 5.


If the graph of a function f(x) is not unbroken then more care must be
exercised when discussing the notion of a limit. The reason can be seen after
examination of Fig. 3.6 in which the graph has a break at x = c, at which
point the functional value f(c) has been allocated arbitrarily. This graph
defines a perfectly satisfactory function, but as x approaches c from either the
left or the right, so f(x) approaches either the value $L_-$ or $L_+$, which are
obviously limits in some sense. Furthermore $L_- \ne L_+$ and neither is equal to
f(c). To take account of this, we introduce the concepts of a limit from the
left and a limit from the right.

Fig. 3.6 Function f(x) has broken graph.

To simplify the explanation we shall write $x \to a-$ in place of 'x tends to
a from the left' and $x \to a+$ in place of 'x tends to a from the right'. In terms
of this notation the function f(x) in Fig. 3.6 has the property that
$\lim_{x \to c-} f(x) = L_-$ and $\lim_{x \to c+} f(x) = L_+$, which is indicated in the diagram
by means of arrows. Once
again, in arriving at the limits from the left and right of a point, the functional
value itself at that point is not involved. It may or may not equal one of
the two limits so defined. These ideas may be expressed formally as a definition.


DEFINITION 3.6 The function f(x) will be said to have the left-hand limit,
or limit from the left, $L_-$ as $x \to a-$ if, and only if, for any arbitrarily small
positive number ε, there exists a small positive number δ such that

$$0 < a - x < \delta \Rightarrow |f(x) - L_-| < \varepsilon.$$

A corresponding definition exists for the right-hand limit, or limit from
the right, as $x \to a+$ in which $L_-$ is replaced by $L_+$ and the condition
$0 < a - x < \delta$ by $0 < x - a < \delta$.
Notice that the function f(x) in Fig. 3.6 only has one-sided limits at
x = a and x = d and, even though f(x) has a cusp at x = b, and so is not
smooth there, it nevertheless still has a limit in the ordinary sense at that
point. This is because of the following obvious result.

THEOREM 3.3 If f(x) has identical left- and right-hand limits at a point
x = a so that $L_- = L_+ = L$, say, then $\lim_{x \to a} f(x)$ exists and is also equal to L.

We shall usually resolve simple limit problems of the type just discussed 
either intuitively or, perhaps, by appeal to a graph. However, for complete- 
ness, we now apply the formal definition of a left-hand limit to a specific 
function to show, in principle, how it may be used as an analytical tool in 
less obvious situations. 

For our example we apply the formal definition of a left-hand limit at the 
point x = 1 to the function 


$$f(x) = \begin{cases} x^2 & \text{for } x < 1, \\ 9 & \text{for } x > 1. \end{cases}$$


Clearly the left-hand limit at x = 1 is determined only by the behaviour
of f(x) to the left of that point. The behaviour of f(x) for x > 1 is irrelevant
to the determination of $\lim_{x \to 1-} f(x)$. Obviously, as $x \to 1-$ so $x^2 \to 1$, and thus,
intuitively, $\lim_{x \to 1-} f(x) = 1$.

If our intuitive argument is correct and this limit is in agreement with our
definition, we must show that for any ε > 0 we can find a positive δ, which
will probably depend on ε, such that $|x^2 - 1| < \varepsilon$ when $x \to 1-$ and
$0 < 1 - x < \delta$ or, equivalently, $1 - \delta < x < 1$.

We have $|f(x) - L_-| = |x^2 - 1| = |(x - 1)(x + 1)| = |x - 1| \cdot |x + 1|$,
but since $|x - 1| < \delta$ this becomes

$$|x^2 - 1| < \delta |x + 1|. \qquad \text{(A)}$$

Since x < 1 (and x > −1 once δ is taken less than 2), we overestimate x in (A)
if we replace it by the value unity so that we have

$$|x^2 - 1| < 2\delta. \qquad \text{(B)}$$

Finally, to make this expression less than any small positive number ε,
we need only make 2δ < ε. This finally proves that $\lim_{x \to 1-} f(x) = 1$.


Some numbers might help here. Suppose, for example, we wish to find the
condition that f(x) should be within 0.001 of the left-hand limit at x = 1.
This amounts to asking that $|x^2 - 1| < 0.001$, which is equivalent to setting
ε = 0.001. Hence, as δ < ½ε = 0.0005, our x-inequality 1 − δ < x < 1 tells
us that the required condition on f(x) will be satisfied provided 0.9995 < x < 1.
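
The arithmetic of this example can be checked numerically. The following is a minimal Python sketch and not part of the original text; the helper names `f_left`, `delta_for` and `stays_within` are illustrative assumptions. It samples points in 1 − δ < x < 1 with δ = ½ε and confirms that x² stays within ε of the left-hand limit 1.

```python
def f_left(x):
    """Branch of the example function used for x < 1."""
    return x * x


def delta_for(eps):
    """Choice suggested by inequality (B): delta = eps / 2."""
    return eps / 2


def stays_within(eps, samples=1000):
    """Check |f(x) - 1| < eps at sample points with 1 - delta < x < 1."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        x = 1 - delta * k / (samples + 1)  # strictly between 1 - delta and 1
        if abs(f_left(x) - 1) >= eps:
            return False
    return True


if __name__ == "__main__":
    for eps in (0.1, 0.001, 1e-6):
        print(eps, delta_for(eps), stays_within(eps))
```

A sampled check of this kind only illustrates the inequality; it is the algebraic argument leading to (B) that proves it for every x in the interval.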

In higher mathematics this analytical approach is indispensable but, as
already remarked, for our purposes a graphical approach to the limit of a
function must suffice in most cases. An exception is the discussion of
indeterminate forms, which involve finding the limit of a quotient as x approaches
some value at which both numerator and denominator vanish. This will
be taken up again later as an application of calculus, though the reader should
notice that we have already resolved one such simple problem involving a
limit of the form 0/0.

Although a function such as


_ {x24 1 for x an integer 
ΛΟ) = ᾿ for all other x 


is a perfectly satisfactory function from the mathematical point of view, it is 
not likely to occur in connection with physical problems. We make this 
assertion because in the physical world functional relationships are usually 
smoothly changing in the sense that a small change in the independent 
variable usually produces only a small change in the dependent variable. 
This is not always the case however and, for example, in gas flows involving a 
gas shock wave the gas pressure experiences a sudden jump across a geo- 
metrical surface in space called the shock front. Hence a graph of the gas 
pressure p across a plane shock at x = a, as a function of the distance x 
measured normal to the shock front, could appear as in Fig. 3-7. 


Fig. 3-7 Gas pressure p as a function of distance normal to shock front at x = a. (Labels in figure: shock front; pressure jump across shock front.)


Nevertheless, despite the existence of common physical situations of this
type, a function as erratic as $f_1(x)$ is not likely to be encountered in the real
world. Aside from points at which a jump occurs, the ‘reasonable’ functions
that occur in physics and engineering must be expected to have the
smoothness-of-change property we described earlier.

This smoothness-of-change property is given the mathematical name 
continuity and plays an important part throughout all mathematical analysis. 
If the reader pauses to think for a moment he will see that the following 
definition describes continuity in terms of the left- and right-hand limits. 


DEFINITION 3-7 The function f(x) is said to be continuous at $x = x_0$ if:

(a) $\lim_{x \to x_0-} f(x) = \lim_{x \to x_0+} f(x) = L$

and

(b) $f(x_0) = L$.

In this definition, (a) demands the equality of the left- and right-hand
limits and (b) ensures that there is no ‘gap’ in the graph of f(x) at $x = x_0$.
That is to say that the point $(x_0, f(x_0))$ lies on an unbroken curve and so
coincides with the limits (a). An alternative, but equivalent, definition of
continuity that is often used replaces (a) by the requirement that
$\lim_{x \to x_0} f(x) = L$, but still retains (b). Either form of definition is equally
good but we have chosen to emphasize the ideas of left- and right-hand limits
since they find important applications in engineering and physics.

Continuity essentially describes a property of a function in the
neighbourhood of a point of interest and not just at the point itself. Accordingly,
a function will be said to be continuous in the interval (a, b) if it is continuous
at all points x within (a, b).

Notice that the effect of condition (b) of our definition on a function such 
as 

x3 + 1 for x #1 
fey = (2 forx = 1 


is to show that f(x) is continuous everywhere except at x = 1. 

Let us paraphrase the notion of continuity. In effect, by requiring that a
function f(x) be continuous at x = a, we are insisting that, if the variation
of the function about the value L = f(a) is not to exceed ±ε, where ε > 0
is arbitrary, then we can find an x-interval of width 2δ centred on x = a
within which this property is always true. This is illustrated by Fig. 3-5, which
also indicates that in general the number δ depends on both ε and the value
of x at which f(x) is continuous. Thus for the same value of ε, the interval
about x = a is of width 2δ, whereas the interval about x = b is of width
2δ′, with δ′ ≠ δ.

If the function f(x) is continuous in a closed interval $[x_1, x_2]$ and ε is
given, consider the point x = b at which the function changes most rapidly,
and find the appropriate interval of width 2δ′ centred on x = b in which the
functional variation from f(b) does not exceed ±ε. Because the functional
variation at x = b was the greatest of any point in $[x_1, x_2]$, it is obvious that
if this same interval of length 2δ′ is associated with any other point x′ in
$[x_1, x_2]$, then the functional variation within that interval will certainly differ
by less than ±ε from the value f(x′). Hence we can assert that for a function
f(x) which is continuous in a closed interval, when given an ε it is possible
to find a number δ for the definition of continuity which depends only on ε
and in no way on the value of x at which continuity is being discussed.
Because of this continuity property which applies uniformly to points
throughout the closed interval $[x_1, x_2]$ we speak of such functions as being
uniformly continuous. This concept proves to be of extreme importance when
these ideas are pursued further.

The requirement of continuity in a closed interval cannot be relaxed, for
then the result is no longer true. For example, the function f(x) = 1/x defined
in the semi-open interval (0, 2] is continuous, but not uniformly continuous.
This is because for any given ε, the closer we take our point x′ to the origin,
the smaller must we take the value of δ in order to satisfy |f(x) − f(x′)| < ε
for |x − x′| < δ. There is obviously no single positive value of δ that will apply to
the entire interval.
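
The way δ is forced to shrink can be made concrete. The sketch below is an illustration only, not part of the original text; the function name `largest_delta` and the closed-form expression in it are derived here under the assumption x′ > 0, by solving 1/(x′ − δ) − 1/x′ = ε for δ, which is the binding case on the left of x′.

```python
def largest_delta(x0, eps):
    """Largest delta with |1/x - 1/x0| < eps whenever |x - x0| < delta (x0 > 0).

    The binding case is x = x0 - delta, where 1/x - 1/x0 = eps gives
    delta = eps * x0**2 / (1 + eps * x0).
    """
    return eps * x0 * x0 / (1 + eps * x0)


if __name__ == "__main__":
    eps = 0.1
    for x0 in (2.0, 1.0, 0.1, 0.01, 0.001):
        print(f"x' = {x0:<6} largest delta = {largest_delta(x0, eps):.3e}")
```

For fixed ε the admissible δ behaves like ε x′² as x′ approaches the origin, so it falls below any proposed positive value, which is the failure of uniform continuity described above.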

There are a number of immediate consequences of the definition of a
limit of a function and of the definition of continuity which we now state as
two important theorems.


THEOREM 3-4 (limits) Suppose that $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = M$, then

(a) $\lim_{x \to x_0} (f(x) + g(x)) = L + M$;
(b) $\lim_{x \to x_0} f(x)g(x) = LM$;
(c) provided $M \neq 0$, $\lim_{x \to x_0} f(x)/g(x) = L/M$.


The proof of these results is similar in all respects to the proof of Theorem 
3-1 and since a representative example was presented there we shall not 
repeat the argument again. 


THEOREM 3-5 (continuity) If f(x) and g(x) are continuous at $x = x_0$, then
so also are the functions

(a) f(x) + g(x);
(b) f(x)g(x);
(c) f(x)/g(x), provided $g(x_0) \neq 0$.


If, furthermore, f(x) is continuous at $x = x_0$ and g(u) is continuous at
$u = f(x_0)$, then the continuous function of a continuous function g[f(x)]
is continuous at $x = x_0$.


Once again the proof of this theorem is similar in all respects to the proof
of Theorem 3-1. However for the curious reader we shall prove result 3-5 (a),
using the alternative definition of continuity that we mentioned.

To prove f(x) + g(x) is continuous at $x = x_0$ we must establish that
$\lim_{x \to x_0} (f(x) + g(x)) = L$ exists and that $f(x_0) + g(x_0) = L$. Now as f(x) and
g(x) are continuous at $x = x_0$ by supposition, then $\lim_{x \to x_0} f(x) = f(x_0)$ and
$\lim_{x \to x_0} g(x) = g(x_0)$ and so for any positive ε there must exist positive numbers
$\delta_1$ and $\delta_2$ such that $|x - x_0| < \delta_1 \Rightarrow |f(x) - f(x_0)| < \tfrac12\varepsilon$ and
$|x - x_0| < \delta_2 \Rightarrow |g(x) - g(x_0)| < \tfrac12\varepsilon$. Now,

$$|(f(x) + g(x)) - (f(x_0) + g(x_0))| = |(f(x) - f(x_0)) + (g(x) - g(x_0))| \le |f(x) - f(x_0)| + |g(x) - g(x_0)|$$

and $|x - x_0| <$ smaller of $(\delta_1, \delta_2)$ $\Rightarrow |f(x) - f(x_0)| + |g(x) - g(x_0)| < \tfrac12\varepsilon + \tfrac12\varepsilon$.
Thus, given any positive ε, we have established that by taking δ less than either
$\delta_1$ or $\delta_2$ we ensure that $|(f(x) + g(x)) - (f(x_0) + g(x_0))| < \varepsilon$. This formally
proves our assertion. The proofs of results (b) and (c) are similar.

Arguments involving continuity usually rely for their success on the 
knowledge that certain familiar functions are continuous. Once a small list 
of such functions has been established it can then be considerably enlarged 
by repeated applications of Theorem 3-5. Accordingly, we present below a 
table of functions, in each case stating the intervals in which they are con- 
tinuous. No proof will be given for most entries since the results are obvious 
from the graphs but for the sake of completeness we shall formally prove the 
first three entries. 


Example 3:5 


(a) Given that C = constant, the function f(x) = C is continuous everywhere.


The proof is trivial, since for any x = xo, f(xo) = C showing that the defini- 
tion is always satisfied. 


(b) The function f(x) = x is continuous everywhere. 


The proof is again trivial, but let us indicate how the alternative definition of
continuity may be used. We must prove that for all $x_0$, $\lim_{x \to x_0} f(x)$ exists and is
equal to $f(x_0)$. Now it is obvious from the definition of f(x) that $f(x_0) = x_0$.
Also, for any $x_0$ and given ε > 0, $|f(x) - f(x_0)| = |x - x_0|$, so that
$|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \varepsilon$, and in this case we may take the
quantity δ = ε. The function is thus continuous at $x = x_0$ and, as $x_0$ was
arbitrary, it finally follows that f(x) = x is continuous everywhere.


(c) The function $f(x) = x^n$ with n a positive integer is continuous everywhere.


We give a proof by induction. Suppose the result is true for some n so that
$x^n$ is continuous at $x = x_0$ for all $x_0$. Now $x^{n+1} = x \cdot x^n$, and we have just
proved that x is continuous at $x_0$. Hence, using Theorem 3-4 (b), $x^{n+1}$ is
continuous. The result is true for n = 1 and so by the principle of induction
it is true for all n. With a little more care this result can be shown to be true
for any real positive n and not just for n a natural number.

The information contained in this table is likely to be useful on many
occasions and so should be memorized. Its application, together with
Theorem 3-5, to questions of continuity is usually immediate. Thus, for
example, the function f(x) = 1/x + sin x is continuous everywhere except at
the point x = 0, and $f(x) = (x^m + a_1 x^{m-1} + \cdots + a_m)/\sin x$, with m > 0,
is continuous everywhere except at the points x = nπ for which n is an integer.

Table 3.2 Short list of continuous functions

Function f(x)                                  Interval over which f(x) is continuous
C (constant)                                   (−∞, ∞)
x                                              (−∞, ∞)
$x^n$ (n > 0)                                  (−∞, ∞)
$x^{-n}$ (n > 0)                               (−∞, ∞) excluding point x = 0
|x|                                            (−∞, ∞)
$x^n + a_1 x^{n-1} + \cdots + a_n$ (n > 0)     (−∞, ∞)
$\dfrac{x^n + a_1 x^{n-1} + \cdots + a_n}{x^m + b_1 x^{m-1} + \cdots + b_m}$     (−∞, ∞) excluding the zeros of the denominator
sin x                                          (−∞, ∞)
cos x                                          (−∞, ∞)
tan x                                          (2n − 1)π/2 < x < (2n + 1)π/2, integral n
sec x                                          (2n − 1)π/2 < x < (2n + 1)π/2, integral n
cosec x                                        nπ < x < (n + 1)π, integral n
cot x                                          nπ < x < (n + 1)π, integral n


Finally, in preparation for our use of limits in connection with the
techniques of differentiation, we extend the O-notation to include functions of
smaller order. Henceforth, we shall write


$$f(x) = o(g(x)) \quad \text{as } x \to x_0$$

with the meaning that

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = 0.$$

The symbol o is read ‘little oh’ and in words the statement asserts that the
function f(x) is of smaller order than g(x) as $x \to x_0$. For example, we may
write $(1 + x^2)^3 = 1 + 3x^2 + o(x^3)$ as $x \to 0$, since
$(1 + x^2)^3 - 1 - 3x^2 = 3x^4 + x^6 = o(x^3)$ as $x \to 0$.
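
The little-oh statement in this example can be examined directly. The following Python sketch is an illustration added here, with the hypothetical helper name `remainder_ratio`; it forms the quotient of the remainder by x³ in exact rational arithmetic, so that no rounding error masks the behaviour for small x.

```python
from fractions import Fraction


def remainder_ratio(x):
    """Return ((1 + x^2)^3 - 1 - 3x^2) / x^3, which should tend to 0 as x -> 0."""
    remainder = (1 + x**2) ** 3 - 1 - 3 * x**2
    return remainder / x**3


if __name__ == "__main__":
    for k in range(1, 6):
        x = Fraction(1, 10**k)  # exact rational arithmetic avoids rounding error
        print(f"x = 1e-{k}: ratio = {float(remainder_ratio(x)):.3e}")
```

The quotient equals 3x + x³ exactly, so it decreases in proportion to x, in agreement with the definition.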


3.5 Functions of several variables—limits, continuity 


The related concepts of a limit and the continuity of a function extend without
difficulty to functions of more than one independent variable, provided only
that the notion of the proximity of two points is suitably extended. The ideas
involved here can best be appreciated if we confine attention to functions
f(x, y) of the two independent variables x and y.

Let us suppose that f(x, y) has for its domain of definition some region D
in the (x, y)-plane and that $(x_0, y_0)$ is some point interior to D. Then, before
considering f(x, y), we must first make clear what is to be meant by $x \to x_0$,
$y \to y_0$ in D.

Fig. 3-8 Paths for which the point $(x, y) \to (x_0, y_0)$.


An inspection of Fig. 3-8 shows that starting from the points P and Q in
D, both the full curve and the dotted curve describe possible paths by which
x and y may tend to $x_0$ and $y_0$. In general, we shall write $x \to x_0$, $y \to y_0$, or
say that the point (x, y) tends to the point $(x_0, y_0)$, if $p \to 0$, where
$p = \sqrt{(x - x_0)^2 + (y - y_0)^2}$ is the distance between the moving point
(x, y) and the fixed point $(x_0, y_0)$. This simple device then allows us to
interpret a statement about the two variables x and y in terms of a statement
about the single variable p. By confining attention to a circular region of
radius δ centred on $(x_0, y_0)$ we may conveniently define a neighbourhood of
the point $(x_0, y_0)$. Any rectangle or other simple closed geometrical curve
containing $(x_0, y_0)$ would, of course, serve equally well to define a
neighbourhood of $(x_0, y_0)$. When using such a neighbourhood it may or may not be
necessary to exclude the boundary and the point $(x_0, y_0)$ itself from the
definition of the neighbourhood.

Thus, for example, the square x = 0, y = 0, x = 1, and y = 1 defines a
neighbourhood of the point (½, ½). The function

$$f(x, y) = 1/\{xy(x - 1)(y - 1)(x - \tfrac12)(y - \tfrac12)\}$$

is defined in this neighbourhood, but not at (½, ½), on the boundary or on
x = ½, y = ½.

Definition 3-8 is now proposed, with this interpretation of $x \to x_0$,
$y \to y_0$ firmly in mind.

DEFINITION 3-8 The function f(x, y) will be said to tend to the limit L as
$x \to x_0$ and $y \to y_0$, and we shall write

$$\lim_{\substack{x \to x_0 \\ y \to y_0}} f(x, y) = L,$$

if, and only if, the limit L is independent of the path followed by the point
(x, y) as $x \to x_0$ and $y \to y_0$.


As before, we do not necessarily require that $f(x_0, y_0) = L$, as the
functional value actually at the limit point $(x_0, y_0)$ is not involved in the limit
process. If it can be established that the result of the limiting operation depends
on the path taken then, demonstrably, the function has no limit. The following
examples make these ideas clear and, on account of their simplicity, are
offered without proof.


Example 3-6 
2x 2. 2 
If SS Hien: tte -. 
(a) T(x y) ἊΣ + 2 | en ἀν τ re +i Π 
y—>3 
: xy + J ee) 2st eee 
(b) iffy) = -ῷ ae then lim τν τ ΠΝ 0: 
yl 
sin xy sin xy 4 


. 
3 


then lim 


ἢ opel Seem Ba sy ek 
(c) 1 I(x, y) x2 + 7 + | ae x2 + y? + | 8ὃ- πὸ 
μ--»1 


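(d) if $f(x, y) = \dfrac{x(y - 1)}{y(x - 1)}$, then $\lim_{\substack{x \to 1 \\ y \to 1}} f(x, y)$ does not exist since
$\lim_{\substack{x \to 1 \\ y \to 1}} f(x, y) = 1$ if taken along the line y = x, but
$\lim_{\substack{x \to 1 \\ y \to 1}} f(x, y) = -1$ if taken along the line y = 2 − x.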
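
The path dependence in part (d) is easy to observe numerically. The sketch below is illustrative only; it assumes the form f(x, y) = x(y − 1)/[y(x − 1)] for the function of part (d), and the helper names and the parameter t (the offset of x from 1) are introduced here for convenience.

```python
def f(x, y):
    """Function assumed for Example 3-6 (d); undefined where y = 0 or x = 1."""
    return x * (y - 1) / (y * (x - 1))


def along_y_equals_x(t):
    """Approach (1, 1) along the line y = x, with x = 1 + t."""
    return f(1 + t, 1 + t)


def along_y_equals_2_minus_x(t):
    """Approach (1, 1) along the line y = 2 - x, with x = 1 + t."""
    return f(1 + t, 1 - t)


if __name__ == "__main__":
    for t in (1e-1, 1e-3, 1e-6):
        print(t, along_y_equals_x(t), along_y_equals_2_minus_x(t))
```

Along y = x the values are identically 1, while along y = 2 − x they equal −(1 + t)/(1 − t) and approach −1, so no single limit L can serve for both paths.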


As might be expected, the concept of continuity of a function f(x, y) of 
two variables then follows as a direct extension of the definition of a limit. 


DEFINITION 3-9 The function f(x, y) will be said to be continuous at the
point $(x_0, y_0)$ if:

(a) $\lim_{\substack{x \to x_0 \\ y \to y_0}} f(x, y) = L$ exists

and

(b) $f(x_0, y_0) = L$.


We shall say that f(x, y) is continuous in a region if it is continuous at all
points (x, y) belonging to that region. Notice that condition (a) demands that
f(x, y) has a unique limit as $x \to x_0$ and $y \to y_0$, and condition (b) then ensures
that there is no ‘hole’ in the surface z = f(x, y) at the point $(x_0, y_0)$. The
continuity of a function f(x, y) is illustrated in Fig. 3-9 where a circular
neighbourhood of the point $(x_0, y_0)$ is shown in relation to the surface. In
effect, continuity of f(x, y) is simply requiring that a small change in location
of the point (x, y) will cause only a small change in z = f(x, y).


Fig. 3-9 Continuity of f(x, y) at $(x_0, y_0)$ and discontinuity at (a, b).


In Fig. 3-9 the point (a, b) has been deliberately detached from the
otherwise unbroken surface z = f(x, y), so that the function f(x, y) does not
satisfy the definition there and hence is not continuous at that single point. In
general, a function of one or more variables which is not continuous at a
point will be said to have a discontinuity at that point or, alternatively, to be
discontinuous there. Thus the function of one variable shown in Fig. 3-6
has a discontinuity at x = c and the function of two variables shown in Fig.
3-9 is discontinuous at x = a, y = b.

These ideas also extend to functions of several real variables in an obvious
manner once the ‘distance’ between two points has been defined satisfactorily.
For functions f(x, y, z) of the three independent variables x, y, z a suitable
distance function between points $(x_1, y_1, z_1)$ and $(x_0, y_0, z_0)$ is the linear
distance between them when plotted as points relative to three mutually
perpendicular Cartesian axes. The distance p is then given by the Pythagoras rule
as $p = \{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2\}^{1/2}$.

The interpretation of distance in the so-called finite dimensional spaces of
n dimensions generated by functions of n independent variables is of
considerable importance in mathematics. Essentially, of any function p(P, Q)
measuring the distance between points P and Q in the space we require that
for any points P, Q, and R:

(a) p(P, Q) ≥ 0,
(b) p(P, Q) = 0 if, and only if, P = Q,
(c) p(P, Q) = p(Q, P),
(d) p(P, R) ≤ p(P, Q) + p(Q, R).

It is easy to check that the two distance functions already defined satisfy the
above conditions, but this will be left as an exercise for the reader.
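
As a numerical companion to that exercise, the following Python sketch tests conditions (a) to (d) for the Pythagorean distance on a handful of points. It is an illustration only and proves nothing in general; the function name `satisfies_distance_conditions` and the tolerance `tol` are assumptions introduced here, and `math.dist` (available from Python 3.8) is used for the distance itself.

```python
import itertools
import math
import random


def satisfies_distance_conditions(points, tol=1e-12):
    """Check conditions (a) to (d) for the Euclidean distance on the given points."""
    for P, Q, R in itertools.product(points, repeat=3):
        d_pq = math.dist(P, Q)
        if d_pq < 0:                                        # (a) non-negativity
            return False
        if (d_pq == 0) != (P == Q):                         # (b) zero only when P = Q
            return False
        if abs(d_pq - math.dist(Q, P)) > tol:               # (c) symmetry
            return False
        if math.dist(P, R) > d_pq + math.dist(Q, R) + tol:  # (d) triangle inequality
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    pts2 = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(12)]
    pts3 = [tuple(rng.uniform(-5, 5) for _ in range(3)) for _ in range(12)]
    print(satisfies_distance_conditions(pts2), satisfies_distance_conditions(pts3))
```

The same function accepts points in the plane or in three dimensions, since `math.dist` works for tuples of any common length.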

Again the determination of the regions in which any given function is 
continuous will usually be done either on an intuitive or on a graphical basis. 
Thus, in Example 3-6 it is easily seen that: 


(a) f(x, y) = ain is continuous everywhere; 
I 
(b) f(x, y) = oe = is continuous everywhere except at x = 0, y = 0; 
(c) f(x, y) = eee is continuous everywhere; 
(d) $f(x, y) = \dfrac{x(y - 1)}{y(x - 1)}$ is continuous everywhere except at (0, 0) and (1, 1)
and along x = 1 and y = 0.


3.6 A useful connecting theorem


By now it will have become apparent that there is a strong connection 
between theorems concerning limits of sequences and the corresponding 
theorems concerning limits of functions. In fact, with only trivial modification, 
most limit theorems that are true for sequences are also true for functions. 
Naturally this is no coincidence and the reason is explained by this connecting 
theorem. 


THEOREM 3-6 Let f(x) be a function defined for all x in some interval
a < x < b. Further, let $\{x_n\}$ be a sequence defined in the same interval which
converges to a limit α that is not a member of the sequence. Then if, and only
if, $\lim_{n \to \infty} f(x_n) = L$ for each such sequence $\{x_n\}$, it follows that $\lim_{x \to \alpha} f(x) = L$.


The proof of this connecting theorem comprises two distinct parts. First
it must be established that if $\lim_{x \to \alpha} f(x) = L$, then sequences $\{x_n\}$ exist having
the required property. Second, the converse result must be proved; that if the
required sequences $\{x_n\}$ exist, then $\lim_{x \to \alpha} f(x) = L$. Together, these two results
will ensure that the theorem works in both directions, so that corresponding
function and sequence limit theorems satisfying the necessary conditions may
be freely interchanged without further question.

The first part of the proof is a direct consequence of Definitions 3-3 and
3-5. It follows from Definition 3-5 that when x is confined to some
neighbourhood $N_\alpha$ of α, then f(x) is confined to a neighbourhood $N_L$ of L. From
Definition 3-3, since $\{x_n\}$ has the limit α, there must be some number $n_0$
such that for $n > n_0$ it follows that $f(x_n)$ will also be confined to the same
neighbourhood $N_L$ of L.

The second step is a little harder, since it involves an indirect proof by
contradiction. It involves showing that if we assume that $\lim_{x \to \alpha} f(x) \neq L$,
then a sequence $\{z_n\}$ can be found satisfying all the requirements of the
theorem, for which $\lim_{n \to \infty} f(z_n) \neq L$. Hence the contradiction showing that
the assumption $\lim_{x \to \alpha} f(x) \neq L$ was false. We leave the details of this to any
interested reader as an exercise.
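
The theorem also gives a practical test for the absence of a limit: two admissible sequences along which f tends to different values. The Python sketch below illustrates this with f(x) = sin(1/x) near α = 0. The choice of function and of the two sequences is an assumption made here for illustration and does not come from the text.

```python
import math


def values_along(f, sequence, n_max=50):
    """Return f(x_n) for n = 1..n_max, where sequence(n) gives x_n."""
    return [f(sequence(n)) for n in range(1, n_max + 1)]


def f(x):
    """Illustrative function with no limit as x -> 0."""
    return math.sin(1 / x)


def x_seq(n):
    """x_n = 1/(n*pi): converges to 0, never equals 0, and sin(1/x_n) = sin(n*pi)."""
    return 1 / (n * math.pi)


def z_seq(n):
    """z_n = 1/((2n + 1/2)*pi): converges to 0, and sin(1/z_n) = 1."""
    return 1 / ((2 * n + 0.5) * math.pi)


if __name__ == "__main__":
    print(values_along(f, x_seq, 5))
    print(values_along(f, z_seq, 5))
```

Both sequences converge to 0 without containing it, yet the functional values tend to 0 along the first and to 1 along the second, so by Theorem 3-6 this function has no limit as x tends to 0.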
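
To close this chapter, we shall use this theorem together with geometrical
arguments to establish the three useful limits:

$$\lim_{\theta \to 0} \left(\frac{\sin \alpha\theta}{\theta}\right) = \alpha; \qquad (3\text{-}9)$$

$$\lim_{\theta \to 0} \left(\frac{1 - \cos \alpha\theta}{\theta}\right) = 0; \qquad (3\text{-}10)$$

$$\lim_{\theta \to 0} \left(\frac{1 - \cos \alpha\theta}{\theta^2}\right) = \frac{\alpha^2}{2}. \qquad (3\text{-}11)$$

These limits are all of the indeterminate variety mentioned earlier and,
although this topic will receive special mention in a subsequent chapter, it is
important for the development of our work that they be examined now. We
shall establish that they are all related to the single limit

$$\lim_{\theta \to 0} \left(\frac{\sin \theta}{\theta}\right) = 1,$$

which we prove first.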

Consider Fig. 3-10 which represents a circular arc of unit radius with its
centre at O, inscribed in the right-angled triangle OAB.

Fig. 3-10 Area inequalities.

Then it is obvious that

Area of triangle OAC < Area of sector OAC < Area of triangle OAB.

Expressed in terms of the angle θ measured in radians this becomes

$$\tfrac12 \sin\theta < \tfrac12 \theta < \tfrac12 \tan\theta,$$

from which we see that

$$\cos\theta < \frac{\sin\theta}{\theta} < 1. \qquad \text{(A)}$$


This result must be true for all acute angles θ and, in particular, for the
values of the sequence $\{\theta_n\}$ defined by $\theta_n = 1/n$. Thus (A) takes the form

$$\cos\theta_n < \frac{\sin\theta_n}{\theta_n} < 1 \qquad \text{(B)}$$

and, since $\lim_{n \to \infty} \theta_n = 0$ where the limit is not a member of the sequence, we
may combine Theorems 3-2 and 3-6 to deduce that

$$\lim_{\theta \to 0} \left(\frac{\sin\theta}{\theta}\right) = 1. \qquad (3\text{-}12)$$
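
Inequality (B) can be tabulated for the sequence θₙ = 1/n. The short Python sketch below is an added illustration, with the hypothetical helper name `squeeze_terms`; it returns the three members of (B) so that the squeezing of sin θₙ/θₙ between cos θₙ and 1 can be seen directly.

```python
import math


def squeeze_terms(n):
    """Return (cos t, sin t / t, 1) for t = 1/n, the three members of inequality (B)."""
    t = 1.0 / n
    return math.cos(t), math.sin(t) / t, 1.0


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        lower, middle, upper = squeeze_terms(n)
        print(f"n = {n:<5} {lower:.9f} < {middle:.9f} < {upper:.1f}")
```

For very large n the three floating-point values become indistinguishable, so the sketch is meant for moderate n only.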


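To establish limit (3-9) it is only necessary to replace θ in Eqn (3-12) by
αθ, giving rise to

$$\lim_{\alpha\theta \to 0} \left(\frac{\sin \alpha\theta}{\alpha\theta}\right) = 1$$

or, equivalently,

$$\lim_{\theta \to 0} \left(\frac{\sin \alpha\theta}{\theta}\right) = \alpha.$$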


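The limits (3-10) and (3-11) then follow by using the identity
$1 - \cos \alpha\theta = 2 \sin^2 \tfrac12\alpha\theta$ to form the expressions

$$\frac{1 - \cos \alpha\theta}{\theta} = 2 \sin \tfrac12\alpha\theta \left(\frac{\sin \tfrac12\alpha\theta}{\theta}\right)$$

and

$$\frac{1 - \cos \alpha\theta}{\theta^2} = 2 \left(\frac{\sin \tfrac12\alpha\theta}{\theta}\right)^2.$$

Applying result (3-9) to these we finally arrive at the required results

$$\lim_{\theta \to 0} \left(\frac{1 - \cos \alpha\theta}{\theta}\right) = 2 \cdot 0 \cdot \frac{\alpha}{2} = 0$$

and

$$\lim_{\theta \to 0} \left(\frac{1 - \cos \alpha\theta}{\theta^2}\right) = 2 \left(\frac{\alpha}{2}\right)^2 = \frac{\alpha^2}{2}.$$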
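
The three limits (3-9) to (3-11) may be checked numerically for any chosen α. The following Python sketch is illustrative; the function name `quotients` is an assumption, and 1 − cos αθ is evaluated through the identity 2 sin² ½αθ used above, which also avoids loss of significance for small θ.

```python
import math


def quotients(alpha, theta):
    """Return the three quotients of (3-9), (3-10), (3-11) at a given theta != 0.

    1 - cos(alpha*theta) is evaluated as 2*sin(alpha*theta/2)**2, the identity
    used in the text, which also avoids cancellation for small theta.
    """
    one_minus_cos = 2 * math.sin(0.5 * alpha * theta) ** 2
    return (
        math.sin(alpha * theta) / theta,
        one_minus_cos / theta,
        one_minus_cos / theta**2,
    )


if __name__ == "__main__":
    alpha = 3.0
    for theta in (1e-1, 1e-2, 1e-4):
        print(theta, quotients(alpha, theta))
    print("expected limits:", alpha, 0.0, alpha**2 / 2)
```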

The following general result is sometimes useful and, as we shall show
by example, may be combined with Eqns (3-9) to (3-11) to give a number
of interesting results.

Suppose f(x) and g(x) are two functions such that $\lim_{x \to a} f(x) = \alpha$ and
$\lim_{x \to a} g(x) = \beta$, where α and β are both finite and α > 0. Then, clearly,

$$\lim_{x \to a} [f(x)]^{g(x)} = \left[\lim_{x \to a} f(x)\right]^{\lim_{x \to a} g(x)} = \alpha^\beta.$$

This result, which is true in general, is of course also true when one or
more of the limits involved is of the form Eqns (3-9) to (3-11).

Example 3-7

(a) $\lim_{x \to 1} \left(\dfrac{x^3 + 2x^2 + x + 1}{x^2 + 2x + 3}\right)^{[1 - \cos 2(x - 1)]/(x - 1)^2}$;

(b) $\lim_{x \to 0} \left(\dfrac{1 - \cos 3x}{x^2}\right)^{(\sin 2x)/x}$.

Solution to (a) Here $f(x) = (x^3 + 2x^2 + x + 1)/(x^2 + 2x + 3)$, so that
$\lim_{x \to 1} f(x) = 5/6$ and as $g(x) = [1 - \cos 2(x - 1)]/(x - 1)^2$, it follows from
Eqn (3-11) that $\lim_{x \to 1} g(x) = 2$. Hence, $\lim_{x \to 1} [f(x)]^{g(x)} = (5/6)^2 = 25/36$.

Solution to (b) In this case $f(x) = (1 - \cos 3x)/x^2$ and $g(x) = (\sin 2x)/x$.
A direct application of Eqns (3-9) and (3-11) then shows that $\lim_{x \to 0} f(x) = 9/2$
and $\lim_{x \to 0} g(x) = 2$ and thus $\lim_{x \to 0} [f(x)]^{g(x)} = (9/2)^2 = 81/4$.

PROBLEMS 
Section 3-1 
3-1 Give an example of a numerical sequence and of a non-numerical sequence.

3-2 Use the terms bounded, unbounded, strictly monotonic increasing, and
strictly monotonic decreasing to classify the sequences $\{u_n\}$ which have the
following general terms:


(a) un = (—n)"11; (Ὁ) μη = (: - Ἂ 

(Cc) un = sin (1/n); (4) μη = 2+ (—1)"; 
_ att, _nt+3 

(€) un = FF? ey 


3-3 Give an example of each of the following types of sequence: 


(a) bounded; (b) strictly monotonic decreasing; (c) monotonic decreasing;
(d) strictly monotonic increasing; (e) bounded above; (f) bounded below.


3-4 Use an ordinary graph to plot the first ten terms of the sequence {un} for which 
Un = (—1)"(2 + 2)/n. 


3-5 Using the device described in connection with Fig. 3-1 (b) to compress the
horizontal axis, plot the first five terms of the sequences $\{u_n\}$ which have the
general terms:


(a) un = (-1)" (Fert): 


Le | 
(b) un = 1 +> Th 
γε: °° 


Section 3-2 
3-6 Find a neighbourhood (a, b) of the sequence $\{1 + (-1)^n/n\}$ such that


(a) there are 100 terms outside it; 
(b) there are 10,000 terms outside it. 


Deduce that there are infinitely many terms inside any such neighbourhood. 


3-7 Find a neighbourhood (a, b) of the sequence $\{(2n + 1)/n\}$ such that


(a) there are 10 terms outside it; 
(b) there are 1,000 terms outside it. 


3-8 Name the limit points of the sequence $\{u_n\}$ which has the general term
$u_n = \sin [(n + 1)/2]\pi$. Identify the sub-sequences that determine these limit
points.

3-9 Name the limit points of the sequence $\{u_n\}$ with the general term
$u_n = \sin [(n^2 + n + 1)/2n]\pi$. Identify the sub-sequences that converge to these
limit points.

3-10 Give examples of sequences having (a) no limit point, (b) one limit point,
(c) two limit points.


3-11 Name the limit points of the sequence {un} which has the general term 


1. 
1 — 35, for n even 
un = 


| wn for n odd. 
State whether or not the limit points belong to the sequence.


3:12 Determine the following limits: 
ον τον 
n— οὦ n 
_ (nt -+n— In + 2), 
(Ὁ) mn Gn? + In+ ΤῸ) ᾿ 
Soa ih Goes 
hi. τσ τ τ 
saa ay 
ες n+ (—2)" 
4) lim -- -- --- 
oo Pe Ea ey 


(ea ; =) 


(e) lim 773 


N— © 


3-13 Give an expression for the nth term of the sequence $\sqrt{2}$, $\sqrt{2\sqrt{2}}$,
$\sqrt{2\sqrt{2\sqrt{2}}}$, . . .. Use your result to deduce the limit of the sequence.


3:14 Determine the limits: 
(a) lim (νι + a) — νη), where a > 0 is any real number; 
Ti C 


n(2 sin n — 3 cos 27). 


(0) oy n2+2n+ 1 ‘ 
: 3πτ2 +- §nt+2 
sae ( 30 — 5" ) 


(4) lim ν([ + a")(a = 0). 


3-15 Use the Ο notation to express the behaviour of the following expressions for 


large x: 
(a) 2x? + x + sin (1/x); (b) 3.2; 
3x3 + 2x +1. x8 sin x + 1. 
©) ΧΙ ©) xB+3 ” 
2 
Θ = 


3-16 Suppose that the sequences $\{u_n\}$, $\{v_n\}$, and $\{w_n\}$ are such that $u_n < w_n < v_n$
for all n greater than some fixed number $n_0$, and that $\{u_n\}$ converges to the limit
L and $\{v_n\}$ converges to the limit M. Show by example that the sequence $\{w_n\}$
need not converge to a limit.


3-17 Outline the details of the proof of Theorem 3-2. (Hint: Consider the limits of 
the sequences {un — wn} and {wn — Un}.) 


3:18 Give two different proofs of the convergence of the sequence {un} in which 
1, 1 
Wn = 1+ τ ὡς 7 + 
and then to Theorem 3:2. 


+ —_ appealing first to Theorem 3-1 (a) 


3n-1 
3-19 Use Theorem 3-2 to prove the convergence of the sequence $\{u_n\}$ in which


1, Me, Σ Ζν, Σ DGB Bt 3\ 7 
unm δα [1 Ὁ ΤῚΣ + pasin(1 Ὁ ἸΣτ zasin(1+ 1ΞῈ ie 


n n}2 
—1. n—1\7 
sin(1 + ἐπ Ἰξ 


3-20 Let $\{u_n\}$ be an increasing sequence bounded above by m. Let this bound m,
together with the members $u_n$ of the sequence, be represented by points on a
line. Then, either the mid-point $m_1 = \tfrac12(u_1 + m)$ of the line segment between
$u_1$ and m is an upper bound of $\{u_n\}$, or it is not. According as $m_1$ is, or is not,
an upper bound of $\{u_n\}$, take for the next point $m_2$ the mid-point of the half
line segment to the left or right of $m_1$, respectively. Next, according as $m_2$ is,
or is not, an upper bound of $\{u_n\}$, take for the next point $m_3$ the mid-point of
the quarter line segment to the left or right of $m_2$, respectively. Repeat this
process indefinitely to generate an infinite sequence of points $\{m_r\}$ as indicated
in the diagram.

[Diagram: the points of $\{u_n\}$ and $\{m_r\}$ on a line, with the limit L of $\{u_n\}$ marked.]

Give reasons why

(a) $\{m_r\}$ has a single limit point L;

(b) the fact that $\{u_n\}$ is an increasing sequence implies that $\lim_{n \to \infty} u_n = L$.


3-21 Let $u_n = \tfrac12(u_{n-1} + (a/u_{n-1}))$ and $v_n = (u_n - \sqrt{a})/(u_n + \sqrt{a})$, where $u_1$ and
a are any positive numbers. By showing that
$v_n = v_{n-1}^2 = v_{n-2}^4 = v_{n-3}^8 = \cdots = v_1^{2^{n-1}}$, deduce the result
$0 \le |v_n| \le |v_1|^n$. Then, using Theorem 3-2, prove that $\lim_{n \to \infty} v_n = 0$,
thereby establishing that $\lim_{n \to \infty} u_n = \sqrt{a}$.


3-22 Using the algorithm $u_n = \tfrac12\left(u_{n-1} + \dfrac{3}{u_{n-1}}\right)$ compute to four figures the first
five terms in the sequence $\{u_n\}$ corresponding to the starting values (a) $u_1 = 1$,
(b) $u_1 = 2$. Compare your results with the limiting value $\sqrt{3}$.


Un-1" 


five terms in the sequence {un} corresponding to the starting values (a) μι = 1, 
(Ὁ) μὰ = 2. Compare your results with the limiting value 34/5. 


3-23 Using the algorithm un = Hunt Π | compute to four figures the first 


Section 3-3


The following two related problems show how the approximate behaviour of $e^x$ in
the interval −2 < x < 2 may be inferred directly from the sequence $\{v_n(x)\}$.


3-24 Define $v_n(x)$ by the expression

$$v_n(x) = \left(1 + \frac{x}{n}\right)^n.$$

Use essentially the same arguments as those leading to Eqn (3-4) to prove
that $\{v_n(x)\}$ is a strictly increasing sequence for any fixed positive x and then
show that


ἄπ 5 
tax) ΞῚ extra tag ἘΠ΄ Fa 


By summing this expression and taking the limit as ἡ — οὐ deduce that 


2 
L<er< 5 for O<x< 2. 


Compare this result with Fig. 3-3. 


3-25 Using the same definition of $v_n(x)$ as above, form the sub-sequences $\{v_{2m}(x)\}$
of even terms and $\{v_{2m+1}(x)\}$ of odd terms. Modify slightly the arguments used
in the previous example to prove that both sub-sequences are strictly monotonic
decreasing for negative x. Show that $\{v_{2m+1}(x) - v_{2m}(x)\}$ is a null
sequence and hence deduce that both the even and odd sequences tend to the
same limit. Modify $v_{2m}(x)$ to establish that


x2 x3 xem 


van(x) 5 1 -—x+5 55 ΤΠ παπττ' 
By summing this expression and taking the limit as "1 -- 00 deduce that 
ye 
cet <i for 0O<x <2. 
2% 


Compare this result with Fig. 3-3. 


Section 3-4 
3:26 Determine the following limits of functions: 


2 

(a) lim x3 — x2?+ x41; (b) ἢ τ 5 τ. 

ra | pe! A Ξπ 1 

_ A(x? — 6) . χϑ x?—~x—2 
ς Ππηὶ------....:ὄ 4) lim ---Ο----------- 
ee ee OO" + Dw FD 

h)? — x3 . 
(e) lim ἘΠ τ αι (Ff) lim {4/(x2 + 1000) — +/(x2 — 1000)}; 

r—0 t—> 00 


(g) lim x[\/(x? + 3) — x]. 
τ-» 0 
3:27 Determine these limits when they exist: 


(a) lim f(x) where f(x) a sa 
a) lim f(x) where f(x) = 
a1 i+ sin(x — 1) forx > 1; 


(b) lim ~— 


a—+1 X2 = 
x? + sin 47x for x < 3 


lim fi here = 
(c) al μὰ ss 1 5 + x? for x > 3; 


1 + cos x 


: 2— 11; i . 
(d) lim | x | (ε) lim — 


“--»ὶἐπ 
3.28 Determine the left- and right-hand limits of these functions at the stated
points:
32:11 +. 51:11 
a) lim : 


1+ 2sinxforx < ἐπ 


Ὁ) lim f(x) where f(x) = 
( eum Je) von for x > ἐπ; 


(c) lim| x? +x—-— 1]; 
M 


—2forx <0 


(ὦ lim f(x) where f(x) [ +|x| for x Ὁ 0; 


e) lim : 
3.29 Determine the domains of definition for which these functions are continuous:
(a) f(y) =x+ Bale (b) f(x) = 1/@? — 1); 
χϑτ x? — 1 x8 + 4x7 + x —6 
= —________—__-; d ---- “ ς΄... 
() 4+ sin x — 2cosx eg) (x —  )ὰ τ 4 ’ 


(e) $f(x) = 2x + \sin x$ for $x \neq n\pi/2$, and $f(x) = \dfrac{n^2 + 1}{2n^2 + 3}$ for $x = n\pi/2$.
3.30 Give examples of functions of the following type:
(a) continuous everywhere except at $x = 1$ and $x = 2$;
(b) discontinuous at the points $x = n\pi$ with $n$ an integer;
(c) continuous everywhere but neither purely algebraic nor purely
trigonometric;
(d) continuous everywhere except at $x = 1$, where the left-hand limit is $-1$
and the right-hand limit is 3;
(e) continuous everywhere except at $x = 1$, where the left-hand and
right-hand limits both equal 2.
3.31 Suppose it is known that a function $f(x)$ is continuous over the interval
$x_0 \le x \le x_2$, and that $f(x_0) = y_0$, $f(x_1) = y_1$ and $f(x_2) = y_2$. Explain why
it is reasonable to assume that when the functional values $y_0$, $y_1$, and $y_2$ are
reasonably close together, $f(x)$ may in some sense be represented by the
expression

$$y_0\,\frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + y_1\,\frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + y_2\,\frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}.$$

Any formula such as this, from which the behaviour of a function over an
interval is inferred from its behaviour at specific points in that interval, is
called an interpolation formula. This particular one is called the three point
Lagrangian interpolation formula and we shall see later that it gives exact
results when applied to any linear or quadratic function $f(x)$. Considering
$y = \sin x$ for $0 \le x \le 3\pi$, explain how this formula might give misleading
results.
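
The behaviour described in Problem 3.31 can be explored numerically. The following Python sketch is only an illustration under our own assumptions: the function name `lagrange3`, the test quadratic, and the choice of sample points 0, π, 2π are not taken from the text. It evaluates the three point formula, which reproduces a quadratic exactly, but which stays close to zero for y = sin x sampled where the sine vanishes, even though sin(π/2) = 1.

```python
import math

def lagrange3(xs, ys, x):
    """Three point Lagrangian interpolation through (x0, y0), (x1, y1), (x2, y2)."""
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

# A quadratic is reproduced exactly (up to rounding): f(x) = 2x^2 - 3x + 1.
def f(x):
    return 2 * x**2 - 3 * x + 1

nodes = (0.0, 1.0, 3.0)
quad_value = lagrange3(nodes, tuple(f(t) for t in nodes), 2.0)

# y = sin x sampled at 0, pi, 2*pi is (almost) zero at every sample point,
# so the interpolated value at pi/2 is close to zero although sin(pi/2) = 1.
sine_nodes = (0.0, math.pi, 2 * math.pi)
sine_value = lagrange3(sine_nodes, tuple(math.sin(t) for t in sine_nodes), math.pi / 2)

print(quad_value, f(2.0))
print(sine_value, math.sin(math.pi / 2))
```

The second case shows that closeness of the sampled values alone says little about the function between the sample points.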
3.32 Apply the expression given in Problem 3.31 to the function $y = \sin x$, taking
as the points $x_0$, $x_1$, and $x_2$ the respective radian arguments 0.6, 0.9, and 1.2
and so find the appropriate three point Lagrangian interpolation formula over
the interval $0.6 \le x \le 1.2$. Use your result to deduce approximate values for
sin 0.8 and sin 1.1 and compare these with the exact tabulated values.


3.33 Repeat the previous problem, but this time take $x_0 = 0.4$, $x_1 = 1.2$, and
$x_2 = 1.7$ and deduce approximate values for sin 0.9 and sin 1.5. Compare
your results with the exact tabulated values.


3.34 Consider the continuous function $f(x)$ defined on the interval [0, 2] by the
rule $f(x) = x$ for $0 \le x \le 1$ and $f(x) = 2 - x$ for $1 < x \le 2$. Taking
$x_0 = 0.2$, $x_1 = 0.8$, $x_2 = 1.3$, apply the expression given in Problem 3.31 in
order to find an interpolation formula over the interval [0.2, 1.3]. Compare
the approximate and exact values at $x = 0.5$, 0.7, and 1.0.


3.35 The density of the material of a rod of length $L$ is a function $f(x)$ of the distance
$x$ measured from one end. Describe in physical terms, rods that are
characterized by the following functions $f(x)$:


(a) $f(x)$ = constant for $0 \le x \le L$;
piforO <x «- &L 

b = 

(2) f@) " for ξζ -χ <L; 

(c) f@) = pL -- kx) OSX <L. 


3.36 If the function $f(x)$ has the same meaning as above, specify the functional
forms it must take in order that it describes:


(a) a rod of length $L$ having constant density $\rho_1$ over half its length and a
density that changes steadily (that is, linearly) with distance from $\rho_1$ to
$\rho_2$ over the remaining half of the rod;

(b) a rod of length $L$ comprising three sections of equal length with constant
densities $\rho_1$, $\rho_2$, and $\rho_3$ in each section;

(c) a rod of length $L$ having a density that increases quadratically with $x$
(that is, like the square of $x$) from $\rho_1$ at $x = 0$ to $\rho_2$ at $x = L$.


Section 3.5 


3:37 Let f(x, y) denote the density of the material at the point (x, y) of a thin flat 
plate in the (x, y)-plane. Give the functional forms of f(x, y) in order that it 
should describe: 


(a) a circular plate of radius $R$ centred at the origin, with the material to the
left of the y-axis having a density $\rho_1$ and the material to the right a density
$\rho_2$;

(b) a circular disc of inner radius R and outer radius 3R in which the density 
is constant and equal to p out to a circle of radius 2R, after which it 
decreases linearly to the value 4p at the outer edge of the disc; 

(c) an isosceles triangle with its apex at the origin and sides of length L 
lying to the right of the y-axis and inclined at angles ἔπ and -- ἔπ, respec- 
tively, to the x-axis, with the material above the x-axis having a density 
ρι and the material below the x-axis having a density pe. 


3.38 Let point P have the Cartesian coordinates (1, 1), and let $N_1$ denote the unit
circle drawn with P as its centre. Define $N_r$ to be a circle, concentric with $N_1$,
and let us agree to write $N_{r+1} \subset N_r$ if the circle $N_{r+1}$ is contained within the
circle $N_r$. Then $N_{r+1} \subset N_r$, for all $r$, describes a family of neighbourhoods of
the point P. Give examples of families $\{N_r\}$ of neighbourhoods of P that:
(a) have the property that lim (radius of Nr) > ἃ; 
γ-- 0 


(b) have the property that lim (radius of Nr) —> 0; 
(c) have the property that; area N;+1 = 3 area N; and lim (area of N,) —> 3. 
r—> 00 


3.39 State the largest neighbourhood about the stated points P in which the
following functions are defined. Also state if they are defined at P and on the 
boundary of the neighbourhood: 


(a) f(x, y) = Wi{xyQx -- I)(y + 2)(x + τὴν — 2)} taking point P as (— 1, 2); 


1 : 
(0) f(x, y) = ee taking point P as (0, 0); 


| : : 
(c) f(x, y) = ΠΕ Ἐπ taking point P as (2, 3). 
3.40 Determine these limits when they exist:
; 3x%y _  2x*-+xy+1 
] τιν Δ ὦ ee iat Aaa ΚΕΙ͂Σ, re 
(a) ne 2x? + 2y2 + 1’ eye x2 + Qxy + y?’ 
μ-ς2 y—2 
-- 1) 5] i 
Pin (y — 1) sin as (a) lim 1 + 2cos xy + sin xy 
r—->2 x* - 4 at 2 + XV 
yl y—}n 
3.41 Give examples of functions $f(x, y)$ having these properties:
(a) lim f(x, y) = 2; (b) lim. f(x, y) = 0; 
a—>1 r— hn 
Μ--Ὸ yh 


(c) lim f(x, y) does not exist. 
α---2 


y>~3 
3.42 Find the points or lines of discontinuity of these functions:
Ὁ for x? + γῆ = 1 


(a) fy) = renee ae elsewhere; 


l—x?—y 
3forx=1,y=2 
(b) f(, y) = xy 
Se a 
x3 + 2xyp + 1 
Cie) 


_ x*siny + y? sin x + 2 
DLO) = aE Taye aT 

3.43 Let P and Q be any two points in the (x, y)-plane. Prove that if the distance
function $\rho(P, Q)$ is taken to be the length of the straight line joining P to Q
then:

(a) $\rho(P, Q) \ge 0$;

(b) $\rho(P, Q) = 0$ if, and only if, $P = Q$;

(c) $\rho(P, Q) = \rho(Q, P)$;

(d) $\rho(P, R) \le \rho(P, Q) + \rho(Q, R)$, where R is another point distinct from
P and Q.


3:44 Repeat the proof of the previous problem, but this time let P, Q, and R be 
points in space. 


Section 3-6 
3-45 Apply the results of Section 3-6 to determine these limits: 


Ι-- 4’2cosx 


6) tin Fa oa © Ἀν Vesa = Tar 
| h) -- ig cel 
je yi eo. 
h—»0 h x0 x 
. | sin x | 
6) lim ——. 
©) τοῦ x 


3:46 Apply the results of Section 3:6 to determine these limits: 


ὦ fim οὐ + he + 1) (AED 55), 
h—0 


h / 
(b) lim Sr Sap (c) lim er 
za x —a vol \SIN απλ 
] ; x2 — x + 4\ (sin 20 fe 
᾿ noe Vs ; x2—x+4 
᾿ =e [" ᾿ Ὁ ie E —x+ ἢ ' 
χ-- 2 [sin ϑ( — 2)]/(x — 2) 
f) hi - 
( ) oe (= ποῦ ] 


3:47 If A(x) is a function for which lim A(x) = 0, use Theorem 3.6 to justify writing 


τ-τντῷ 


lim (1 ict A(x) UU] = e, 


th 


3-48 Let functions f(x) and g(x) be such that lim f(x) = 1 and πὶ g(x) > 9, so 
ret x 


that we may write f(x) = 1 + h(x) where τὺ h(x) = 0. Then, considering 
the function [f(x)]9?, use the result of pieblent 3-47 to show that 


lim λα) g(r) 
lim [f(x)]9) = et 
tw 


3.49 Use the result of Problem 3.48 to determine these limits:


Ξ 1\7 x — 17 
] 1—-}]; b) | : 
ὦ fim ( - eee ἘΞ | : 


x τ 
i ᾿ ; : 171 
(c) lim (A) : (d) un + sin 2x) ᾿ 


LB Ὁ 
3.50 Determine the following limits which do not necessarily require the result of
Problem 3.48:
x 
(a) lim [! + =) : 
χ-- 00 x 


᾿ 1 \(4%+3)/(@+ 2) 
0) lim [- 3 
(©) tim (53 


(6) lim (cos x)1/*; 
x—0 


2\=" 
(b) lim ( ἘΞ : 


t—> CO 


(d) lim (cos x)?/*"; 
τοῦ 


sin 3x\ (7+ δ) (25: 3) 
f) lim 


Complex numbers and 
vectors 


4.1 Introductory ideas 


A number of important properties of the real number system have already 
been considered, and we shall now examine to what extent quantities repre- 
sentable as displacements in space may be incorporated into a number 
system. The name vector quantity is reserved for all quantities that are 
representable as a displacement in space or, more exactly, as a directed line 
element. Familiar vector quantities are force, magnetic field and velocity, 
which are all representable by a line whose length is proportional to their 
magnitude and whose direction is parallel to the direction of the original 
quantity. In addition, the line of action of a vector has a sense associated 
with it, which means that we must specify a direction along the line to 
indicate the way in which the vector acts. 

Thus to represent a velocity of 3 ft/s in an easterly direction we would 
first adopt a convenient length scale, say 1 in to represent 1 ft/s and then, 
after marking the points of the compass on our paper, we would draw a line 
3 in long in an east-west direction. Finally we would add an arrow to the 
line pointing eastwards to indicate the sense of the velocity. This line could 
be located anywhere on our paper since it does not represent a velocity that 
is associated with any particular point. Reversal of the arrow would corres- 
pond to a reversal of the direction of the velocity, so that the line would then 
represent a velocity of 3 ft/s in a westerly direction. 

Not all quantities are vectors, and another important group are called 
scalars. The word scalar describes any quantity that has magnitude but no 
direction. Typical scalar quantities which have units are temperature, mass 
and pressure. The real numbers are themselves scalars, and are used to describe 
the numerical magnitudes of both scalar and vector quantities, irrespective 
of whether units may be involved. The terms scalar and vector describe 
collectively two important groups of quantities in the real world. It should, 
however, be added that they do not jointly give a complete description of 
all possible physical quantities. Others exist that are neither scalar nor vector, 
though this need not be elaborated here. 

In giving meaning to the square root operation when applied to negative 
numbers, we shall see that a special kind of two-dimensional vector arises. 
Its value in mathematics has proved to be so great that although such vectors 
are restricted to describing vector quantities in a plane, they have been given 
a special name, complex numbers. Because of this restriction, in addition to
studying complex numbers, we shall need a more general theory of vectors so
that we can describe the cited examples of vector quantities, and any others 
that may arise, in all possible situations and not just in a plane. 

Despite this limitation of complex numbers, their vector properties are 
still important enough in special situations for them to be in this chapter. 
Their value elsewhere in mathematics however is even greater, and makes 
them a discipline in their own right. The main reason for this is to be found 
in their relationship to real numbers and in the consequences of their intro- 
duction into functional relationships in the roles of independent and dependent 
variables. This latter aspect will be pursued later when we discuss another 
valuable geometrical idea, a conformal transformation. In the meantime we 
shall develop the vector properties and algebra of complex numbers to the 
point of general usefulness in mathematics, postponing until the end of this 
chapter the alternative approach that is necessary for study of general 
three-dimensional vector quantities. As already mentioned, each is valuable as 
a separate discipline, though, as would be expected, each has a separate 
notation and, generally, a quite different field of application. 

The following introduction to complex numbers is based only on a 
knowledge of elementary trigonometric identities, and not until after more 
study of the exponential and trigonometric functions will we unify our 
treatment of these two topics. 

The origin of complex numbers was the desire of eighteenth-century 
mathematicians always to be able to compute the roots of polynomials, 
even when they are of the form 


$$x^2 = -1. \tag{4.1}$$


It was Leonhard Euler (1707-83) who first recognized that the real number
system was deficient in respect of admitting solutions to all possible
polynomials and, in connection with Eqn (4.1), he proposed that a new number $i$
be introduced to extend the number system. In keeping with the mathematical
beliefs of that period, he called $i$ the unit imaginary number and related it to
real numbers by requiring that


$$i^2 = -1. \tag{4.2}$$


If we allow the use of this new symbol, then $i = \sqrt{-1}$ is the positive
square root of minus one, whence Eqn (4.1) may be seen to have the two
roots $x = i$ and $x = -i$. That $x = i$ is a root follows from the definition of $i$,
whilst $x = -i$ is also a root since $(-i)^2 = (-1)^2 \cdot i^2 = 1 \cdot i^2 = -1$. With
the introduction of $i$, equations such as

$$x^2 = -k,$$

with $k > 0$, which are slightly more general than Eqn (4.1), can also be solved. The
equation may be re-expressed in the form $x^2 = k \cdot (-1)$, showing that its
roots are $x = i\sqrt{k}$ and $x = -i\sqrt{k}$, where the positive square root is always
taken. For example, if $x^2 = -9$, then the roots are $x = 3i$ and $x = -3i$.
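
As a small numerical check of this result, the following Python sketch uses the standard `cmath` module. The function name `roots_of_minus_k` is our own and is introduced purely for illustration; Python writes the unit imaginary number as `1j`.

```python
import cmath

def roots_of_minus_k(k):
    """Return the two roots of x**2 = -k, assuming k is a positive real number."""
    root = cmath.sqrt(-k)      # principal square root, i*sqrt(k) when k > 0
    return root, -root

r1, r2 = roots_of_minus_k(9)
print(r1, r2)
print(r1 * r1, r2 * r2)        # both squares should reproduce -9
```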

The success of Euler’s idea lies in the fact that only this one new number 
need be introduced to enable solutions to be found to all polynomials, irre- 
spective of their degree. As a first step towards seeing this, consider the 
quadratic equation 


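$$ax^2 + bx + c = 0, \tag{4.3}$$


and suppose that $b^2 - 4ac < 0$. Then, setting $4ac - b^2 = m^2$, and formally
applying the usual formula for the roots of a quadratic, we obtain


$$x = \frac{-b \pm \sqrt{-m^2}}{2a} \quad\text{or}\quad x = \left(\frac{-b}{2a}\right) \pm i\left(\frac{m}{2a}\right).$$


Hence, denoting the two roots by $x_1$ and $x_2$, they take the form

$$x_1 = \left(\frac{-b}{2a}\right) + i\left(\frac{m}{2a}\right) \quad\text{and}\quad x_2 = \left(\frac{-b}{2a}\right) - i\left(\frac{m}{2a}\right). \tag{4.4}$$


The numbers $x_1$ and $x_2$ are not ordinary numbers since each comprises the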
sum of a real number and a multiple of the unit imaginary number i. On this 
basis it is reasonable to conjecture that each root of any arbitrary polynomial 
will be of the same form and, should the multiplier of i be zero, that root 
will reduce to a real number. 
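
The form of the roots in Eqn (4.4) can be sketched in a few lines of Python. This is an illustration only: the function name `complex_quadratic_roots` and the sample quadratic are our own choices, and the sketch deliberately handles only the case of a negative discriminant discussed above.

```python
import math

def complex_quadratic_roots(a, b, c):
    """Roots of a*x**2 + b*x + c = 0 when b**2 - 4*a*c < 0, in the form of Eqn (4.4)."""
    m_squared = 4 * a * c - b * b
    if m_squared <= 0:
        raise ValueError("b**2 - 4ac is not negative, so the roots are real")
    m = math.sqrt(m_squared)
    real_part = -b / (2 * a)       # the real number (-b/2a)
    imag_part = m / (2 * a)        # the multiplier (m/2a) of i
    return complex(real_part, imag_part), complex(real_part, -imag_part)

x1, x2 = complex_quadratic_roots(1, 4, 5)   # x**2 + 4x + 5 = 0
print(x1, x2)
```

Substituting either root back into the quadratic should give zero, apart from rounding.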

This conjecture is correct, but before we may verify it, we must see how to 
perform arithmetic on numbers of this special type. These are the complex 
numbers already mentioned and, henceforth, we shall always refer to them 
by this name. Unless the exact form of a complex number is needed, it is 
useful to denote it by a single symbol, usually z, so that an arbitrary complex 
number z is of the form 


$$z = x + iy, \tag{4.5}$$


where $x$ and $y$ are real numbers. We call Eqn (4.5) the real-imaginary form
of a complex number, and refer to $x$ as the real part of $z$, and to $y$ as the
imaginary part of $z$. In symbolic form we write


$$x = \operatorname{Re} z, \qquad y = \operatorname{Im} z. \tag{4.6}$$


Hence if $z = 4 - 7i$, then $\operatorname{Re} z = 4$ and $\operatorname{Im} z = -7$. We stress that $\operatorname{Re} z$
and $\operatorname{Im} z$ are real numbers. The zero complex number is denoted by 0 and
represents the number $z = 0 + i \cdot 0$.

Already, and without proper justification, we have attributed some 
reasonable arithmetic properties to i. We have, for example, assumed results 
such as $ai = ia$ for all real $a$, and $\sqrt{-a} = \sqrt{-1} \cdot \sqrt{a} = i\sqrt{a}$. To proceed
logically and rigorously it would be necessary to define addition, subtraction,
multiplication, and division for complex numbers and then to examine the
applicability of the real number axioms of Chapter 1 in the case of complex
numbers. This is necessary since whatever the arithmetic laws we now propose
for complex numbers, they must obviously be in agreement with the real
number axioms of Chapter 1, whenever the imaginary parts of complex 
numbers are zero. We shall not in fact justify the complex number axioms we 
now formulate, since this is a straightforward matter and provides good 
exercise for the student (see the problems at the end of the chapter). Instead, 
we simply summarize the results, pausing only to discuss in detail the most 
basic operations necessary for the manipulation of complex numbers. 


4.2 Basic algebraic rules for complex numbers 


First we shall agree to denote addition and subtraction of the complex
numbers $z_1$ and $z_2$ in the usual manner by writing $z_1 + z_2$ and $z_1 - z_2$,
respectively. Multiplication of the complex numbers $z_1$ and $z_2$ will be denoted
by juxtaposition thus, $z_1 z_2$. Before going on, and in order to work with
equations, we must define the meaning of equality between two complex
numbers, and then we can define the operations of addition, subtraction, and
multiplication. The following definitions are all phrased in terms of the
arbitrary complex numbers $z_1 = a + ib$ and $z_2 = c + id$.


DEFINITION 4.1 We shall say that the two complex numbers $z_1$ and $z_2$ are
equal, and will write $z_1 = z_2$ if, and only if, $a = c$ and $b = d$. That is if,
and only if, their real parts and their imaginary parts are separately equal.


Example 4.1 Of the complex numbers $z_1$, $z_2$, and $z_3$ defined by $z_1 = 3 - 2i$,
$z_2 = 1 + 3i$, and $z_3 = 3 - 2i$, it is obvious that $z_1 = z_3$ but that $z_1 \neq z_2$
and $z_3 \neq z_2$.


DEFINITION 4.2 By the sum $z_1 + z_2$ will be understood the single complex
number which written in real-imaginary form has a real part that is the sum
of the real parts of $z_1$ and $z_2$, and an imaginary part that is the sum of the
imaginary parts of $z_1$ and $z_2$. Thus for the stated numbers $z_1$ and $z_2$ we have


$$z_1 + z_2 = (a + c) + i(b + d).$$

Example 4.2 If $z_1 = 2 + i$ and $z_2 = 1 - 3i$, then $z_1 + z_2 = 3 - 2i$.


DEFINITION 4.3 By the difference $z_1 - z_2$ will be understood the single
complex number which written in real-imaginary form has a real part that
is the difference of the real parts of $z_1$ and $z_2$ and an imaginary part that is the
difference between the imaginary parts of $z_1$ and $z_2$. Thus for the stated
numbers $z_1$ and $z_2$ we have


$$z_1 - z_2 = (a - c) + i(b - d).$$


Example 4.3 If $z_1 = 5 + 6i$ and $z_2 = 4 - 2i$, then $z_1 - z_2 = 1 + 8i$.
Using these definitions it is easily verified that axioms A.1 to A.5 of
Chapter 1 also apply to complex numbers. To proceed to an examination of
the other axioms we must define the operation of multiplication.


DEFINITION 4.4 The product $z_1 z_2$, in which $z_1 = a + ib$ and $z_2 = c + id$,
is a single complex number which may be written in real-imaginary form.
The product is carried out algebraically as would be the ordinary product
$(\alpha + \beta)(\gamma + \delta)$, and the final result is obtained by making the identifications
$\alpha = a$, $\beta = ib$, $\gamma = c$, $\delta = id$ and using the result $i^2 = -1$ to combine the
four terms that result into a real part and an imaginary part. Thus we have


$$z_1 z_2 = (a + ib)(c + id) = ac + iad + ibc + i^2 bd = (ac - bd) + i(ad + bc).$$


Example 4.4 If $z_1 = 2 + 3i$ and $z_2 = 1 - i$, then $z_1 z_2 = 5 + i$. As a more
difficult example let us express $(1 + i)^4 + (1 - i)^4$ in real-imaginary form.

Now $(1 + i)^4 = (1 + 4i + 6i^2 + 4i^3 + i^4)$ and $(1 - i)^4 = (1 - 4i + 6i^2 - 4i^3 + i^4)$,
but as $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$, these expressions become
$(1 + i)^4 = -4$ and $(1 - i)^4 = -4$. Hence $(1 + i)^4 + (1 - i)^4 = -8$.
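
The rules of Definitions 4.2 and 4.4 are easily mechanized. In the Python sketch below a complex number $a + ib$ is stored as the pair `(a, b)`; this representation and the function names `add`, `multiply`, and `power` are assumptions made for illustration rather than anything prescribed by the text. The sketch repeats the two calculations of Example 4.4.

```python
def add(z1, z2):
    """Definition 4.2: add real parts and imaginary parts separately."""
    (a, b), (c, d) = z1, z2
    return (a + c, b + d)

def multiply(z1, z2):
    """Definition 4.4: expand (a + ib)(c + id) and replace i**2 by -1."""
    (a, b), (c, d) = z1, z2
    return (a * c - b * d, a * d + b * c)

def power(z, n):
    """z raised to a non-negative integer power n by repeated multiplication."""
    result = (1, 0)                      # the real number 1 written as 1 + i.0
    for _ in range(n):
        result = multiply(result, z)
    return result

product = multiply((2, 3), (1, -1))                    # (2 + 3i)(1 - i)
total = add(power((1, 1), 4), power((1, -1), 4))       # (1 + i)^4 + (1 - i)^4
print(product, total)
```

Working with pairs of real numbers in this way makes it plain that only real arithmetic and the rule $i^2 = -1$ are involved.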


The definitions of addition, subtraction, and multiplication of complex 
numbers are used in the obvious manner for the solution of simple equations. 
Thus, if $2z - (2 + i) = 4 - 3i$, then adding $(2 + i)$ to both sides of the
equation gives $2z + 0 = (4 - 3i) + (2 + i)$ or $2z = 6 - 2i$ whence $z = 3 - i$.

In all cases, the reader should memorize the method employed in the
definitions, and not the quoted formulae.

With this definition of multiplication it is a simple matter to verify that
axioms M.1 to M.4 and also axiom D.1 apply to complex numbers. When
one of the numbers $z_1$ or $z_2$ reduces to a real number, then the real and
imaginary parts of the other are both scaled by the same factor. If the scale
factor is $-1$ the sign of the complex number is reversed. To discuss axiom
M.5 and division we need to proceed more carefully.

As it stands, an expression such as $(a + ib)/c$ is well defined as a complex
number, for we may regard $(1/c)$ as a multiplier of $(a + ib)$ and, provided
$c \neq 0$, Definition 4.4 will give the result. In this case $a$ and $b$ are both scaled
by the factor $(1/c)$. However, it is not clear that the more general expression


$$z_3 = \frac{z_1}{z_2} = \frac{a + ib}{c + id} \tag{4.7}$$


is reducible to a complex number expressible in real-imaginary form. The
key to this problem is to be found in M.5 itself when we recall that division
is really defined as the operation inverse to multiplication. Hence, we must
rewrite Eqn (4.7) in the equivalent form


$$z_3(c + id) = a + ib, \tag{4.8}$$
and then try to determine $z_3$. Now it is easily verified that any complex
number $\alpha + i\beta$ when multiplied by the associated complex number $\alpha - i\beta$
gives the real number $\alpha^2 + \beta^2$. Hence, if both sides of Eqn (4.8) are multiplied
by $(c - id)$, the multiplier of $z_3$ will simply become the real number $c^2 + d^2$.
Carrying out this operation, Eqn (4.8) takes the form


$$z_3(c^2 + d^2) = (a + ib)(c - id) \tag{4.9}$$

whence, dividing by the real number $(c^2 + d^2)$, we find that


$$z_3 = \frac{(ac + bd) + i(bc - ad)}{c^2 + d^2}. \tag{4.10}$$


Equation (4.10) is now in the real-imaginary form of a complex number and
is the result of the quotient (4.7). Many books take expression (4.10) as the
formal definition of the quotient (4.7). The definition we shall propose shortly
is equivalent to Eqn (4.10) in all respects, but its form is much easier to
memorize. The simplification is achieved by the introduction of a new and
useful operation called forming the complex conjugate of a complex number.


DEFINITION 4.5 If $z = a + ib$ is an arbitrary complex number, then the
complex number $\bar{z} = a - ib$ is the complex conjugate of $z$. The symbol $\bar{z}$
is read 'z bar'. Equivalently, we may state that the complex conjugate of a
number is always obtained by changing the sign of the imaginary part of
that number.


With this definition in mind it is easy to show that the following definition
of the quotient $z_1/z_2$ is equivalent to Eqn (4.10).


DEFINITION 4.6 (division) The quotient $z_1/z_2$ of the two complex numbers
$z_1$ and $z_2$ is the complex number $(z_1 \bar{z}_2)/(z_2 \bar{z}_2)$.

Using this definition it is a straightforward matter to verify axiom M.5 for
complex numbers, provided only that $z_2 \neq 0$.


Example 4.5 We illustrate division by setting $z_1 = 2 + i$ and $z_2 = 3 - 2i$.
Now $\bar{z}_2 = 3 + 2i$ and $z_1/z_2 = (z_1 \bar{z}_2)/(z_2 \bar{z}_2) = (2 + i)(3 + 2i)/[(3 - 2i)(3 + 2i)]$,
whence $z_1/z_2 = (4 + 7i)/13$. By this same method, an equation of the form
$2z(2 + i) = 1 + i$ is seen to have the solution $z = (1 + i)/(4 + 2i)$
$= (3 + i)/10$.
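
Definition 4.6 can be followed step by step in code. The Python sketch below again stores $a + ib$ as the pair `(a, b)`, which is our own convention, and it assumes integer real and imaginary parts so that the standard `fractions` module can keep the result exact. The function names are hypothetical and chosen only for this illustration; the two calls repeat the calculations of Example 4.5.

```python
from fractions import Fraction

def conjugate(z):
    """Definition 4.5: change the sign of the imaginary part."""
    a, b = z
    return (a, -b)

def multiply(z1, z2):
    """Definition 4.4 applied to pairs (real part, imaginary part)."""
    (a, b), (c, d) = z1, z2
    return (a * c - b * d, a * d + b * c)

def divide(z1, z2):
    """Definition 4.6: z1/z2 = (z1 * conj(z2)) / (z2 * conj(z2))."""
    numerator = multiply(z1, conjugate(z2))
    denominator, _ = multiply(z2, conjugate(z2))   # real number c**2 + d**2
    if denominator == 0:
        raise ZeroDivisionError("division by the zero complex number")
    return (Fraction(numerator[0], denominator), Fraction(numerator[1], denominator))

quotient = divide((2, 1), (3, -2))    # (2 + i)/(3 - 2i)
solution = divide((1, 1), (4, 2))     # z satisfying 2z(2 + i) = 1 + i
print(quotient, solution)
```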


On account of the fact that $\bar{z}$ is an ordinary complex number, its general
properties are exactly the same as those of any other complex number. Hence
the number axioms that apply to $z$, apply equally well to $\bar{z}$. The following
specially useful results are easily proved, and are related to the arbitrary
complex number $z = x + iy$, to its complex conjugate $\bar{z} = x - iy$ and to
the real number $|z|$ associated with $z$ and defined to be $|z| = (x^2 + y^2)^{1/2}$.
(See Definition 4.7.)

$$z + \bar{z} = 2\operatorname{Re} z = 2x;$$

$$z - \bar{z} = 2i\operatorname{Im} z = 2iy;$$

$$\overline{(z^n)} = (\bar{z})^n;$$

$$\left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|};$$

$$\overline{z_1 z_2 \cdots z_n} = \bar{z}_1 \bar{z}_2 \cdots \bar{z}_n.$$
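
Results of this kind are easily spot-checked with Python's built-in complex type. The sketch below is a numerical check for two sample numbers of our own choosing, not a proof; the identities tested are standard properties of the conjugate and modulus, and `cmath.isclose` is used so that rounding errors do not matter.

```python
import cmath

z1, z2 = complex(3, -2), complex(1, 3)   # sample values chosen for illustration

checks = {
    "z + conj(z) = 2 Re z": cmath.isclose(z1 + z1.conjugate(), 2 * z1.real),
    "z - conj(z) = 2i Im z": cmath.isclose(z1 - z1.conjugate(), 2j * z1.imag),
    "z conj(z) = |z|^2": cmath.isclose(z1 * z1.conjugate(), abs(z1) ** 2),
    "conj(z^3) = conj(z)^3": cmath.isclose((z1 ** 3).conjugate(), z1.conjugate() ** 3),
    "conj(z1 z2) = conj(z1) conj(z2)": cmath.isclose(
        (z1 * z2).conjugate(), z1.conjugate() * z2.conjugate()),
    "|z1/z2| = |z1|/|z2|": cmath.isclose(abs(z1 / z2), abs(z1) / abs(z2)),
}

for name, holds in checks.items():
    print(name, holds)
```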

We now utilize some of these simple properties of the complex conjugate
operation to prove an important theorem concerning the roots of a
polynomial, and shall then deduce three very useful corollaries. In the process of
doing so, we shall take as self-evident the fact that a polynomial $P(z)$ of
degree $n$ has $n$ factors of the form $(z - \zeta)$. These are called linear factors
because they are of degree 1. The numbers $\zeta$ may, or may not, be complex.

THEOREM 4.1 If the nth degree polynomial

$$P(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$$

has its coefficients $a_0, a_1, \ldots, a_n$ real, then if $z = \zeta$ is a zero of $P(z)$, so also
is $z = \bar{\zeta}$ a zero of $P(z)$.

Proof Suppose that $z = \zeta$ is a zero of $P(z)$. Then by definition

$$a_0 \zeta^n + a_1 \zeta^{n-1} + \cdots + a_n = 0.$$

Hence, taking the complex conjugate of this equation we may write

$$\overline{a_0 \zeta^n + a_1 \zeta^{n-1} + \cdots + a_n} = 0.$$


However, the complex conjugate of a sum is the sum of the complex
conjugates of the individual terms comprising the sum so that


$$\overline{a_0 \zeta^n + a_1 \zeta^{n-1} + \cdots + a_n} = \overline{a_0 \zeta^n} + \overline{a_1 \zeta^{n-1}} + \cdots + \overline{a_n}.$$


Now as the $a_r$, $r = 0, 1, \ldots, n$ are real, it follows that $\bar{a}_r = a_r$ and so
$\overline{a_r \zeta^{n-r}} = a_r \overline{\zeta^{n-r}} = a_r (\bar{\zeta})^{n-r}$, for $r = 0, 1, \ldots, n$.
Hence,


$$a_0 \bar{\zeta}^n + a_1 \bar{\zeta}^{n-1} + \cdots + a_n = 0,$$

showing that $P(\bar{\zeta}) = 0$. Thus $z = \bar{\zeta}$ is also a zero of $P(z)$.


Paraphrased, Theorem 4.1 asserts that if a polynomial with real coefficients
has complex zeros, then they must occur in complex conjugate pairs. 
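
The content of Theorem 4.1 can be illustrated numerically. In the following Python sketch the function `evaluate` (our own name) uses Horner's scheme, which is simply a convenient way of evaluating a polynomial and is not taken from the text; the sample polynomial $z^2 + 4z + 5$ and the point $w$ are likewise chosen only for illustration.

```python
def evaluate(coefficients, z):
    """Evaluate a0*z**n + a1*z**(n-1) + ... + an by Horner's scheme."""
    value = 0
    for a in coefficients:
        value = value * z + a
    return value

coefficients = [1, 4, 5]                 # z**2 + 4z + 5, real coefficients
zeta = complex(-2, 1)                    # a zero of this polynomial

at_zeta = evaluate(coefficients, zeta)
at_conjugate = evaluate(coefficients, zeta.conjugate())

# For real coefficients, P evaluated at conj(w) equals conj(P(w)) for any w.
w = complex(1, 2)
lhs = evaluate(coefficients, w.conjugate())
rhs = evaluate(coefficients, w).conjugate()
print(at_zeta, at_conjugate, lhs, rhs)
```

The second comparison is the step on which the proof rests: conjugation passes through sums and products, and leaves the real coefficients unchanged.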

As any zero which is not complex must be real, it follows that we may
formulate a Corollary to Theorem 4.1.


Corollary 4.1 (a) If a polynomial has real coefficients, then those of its
zeros that are not real, occur in complex conjugate pairs.


If $z = \zeta$ and $z = \bar{\zeta}$ represent any pair of complex conjugate zeros in
Theorem 4.1, then $(z - \zeta)$ and $(z - \bar{\zeta})$ must both be factors of $P(z)$. Hence their
product $(z - \zeta)(z - \bar{\zeta})$ must also be a factor. Now


$$(z - \zeta)(z - \bar{\zeta}) = z^2 - (\zeta + \bar{\zeta})z + \zeta\bar{\zeta},$$


and as $\zeta + \bar{\zeta} = 2\operatorname{Re}\zeta$ is a real number and $\zeta\bar{\zeta} = |\zeta|^2$ is also a real number,
it follows that the pair of complex conjugate zeros correspond to a single
quadratic factor with real coefficients. Hence Corollary 4.1 (a) may be
re-phrased thus:


Corollary 4.1 (b) Any polynomial with real coefficients may always be
factorized into a set of factors which are linear or at most quadratic, each of
which has real coefficients. Specifically, if the polynomial is of degree $n$ and
there are $m$ pairs of complex conjugate zeros, then there will be $(n - 2m)$
linear factors with real coefficients and $m$ quadratic factors with real
coefficients.


Finally, as an obvious consequence of this last corollary: 


Corollary 4.1 (c) An odd degree polynomial with real coefficients must
have at least one real zero.


The significance of these results is best illustrated by an example which 
shows how they may often be used to simplify a difficult problem to the 
point at which the solution may be determined by familiar methods. 


Example 4.6 A polynomial $P(z)$ of degree 5 is defined by the relationship


$$P(z) = z^5 + 5z^4 + 10z^3 + 10z^2 + 9z + 5.$$


Given that $z = i$ is a zero, deduce the remaining four zeros and use the
result to express $P(z)$ as the simplest possible product of factors having real
coefficients.


Solution First, as the coefficients of $P(z)$ are all real, Theorem 4.1 is
applicable. Hence if $z = i$ is a zero, then so also is $z = -i$. Thus $(z - i)$ and $(z + i)$
are factors, as is their product $(z - i)(z + i) = z^2 + 1$. Using ordinary
long division to divide $P(z)$ by $(z^2 + 1)$ we find that


$$P(z)/(z^2 + 1) = z^3 + 5z^2 + 9z + 5.$$


Hence to find the remaining factors we must now factorize this cubic
polynomial. As the degree is odd, and the coefficients are real, Corollary 4.1 (c)
applies showing that it must have at least one real zero. At this point we have
recourse to trial and error to find the real zero which for the purposes of this
example has been made an integer.


Thus, setting

$$Q(z) = z^3 + 5z^2 + 9z + 5,$$


we must find a value $z = z_1$ such that $Q(z_1) = 0$. By inspection we see that
$Q(-1) = 0$ showing that the real zero is $z = -1$. This corresponds to the
linear factor with real coefficients $(z + 1)$. Removing the factor $(z + 1)$
from the cubic by long division, we then find that

$$\frac{P(z)}{(z + 1)(z^2 + 1)} = \frac{Q(z)}{(z + 1)} = z^2 + 4z + 5.$$

Finally we apply the standard formula for the roots of a quadratic to this
expression to obtain the remaining two zeros. Completing the calculation,
these are found to be $z = -2 - i$ and $z = -2 + i$. Thus the five zeros are
$z = i$, $z = -i$, $z = -1$, $z = -2 - i$, and $z = -2 + i$. The required
factorization is


$$P(z) = (z + 1)(z^2 + 1)(z^2 + 4z + 5).$$
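
The result of Example 4.6 can be checked by multiplying the factors out again and by substituting the five zeros into $P(z)$. The Python sketch below does both; the helper names `poly_multiply` and `evaluate` are our own, and polynomials are represented, as an assumption of this sketch, by lists of coefficients with the highest power first.

```python
def poly_multiply(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product

def evaluate(coefficients, z):
    """Evaluate a polynomial at z by Horner's scheme."""
    value = 0
    for a in coefficients:
        value = value * z + a
    return value

P = [1, 5, 10, 10, 9, 5]                       # z^5 + 5z^4 + 10z^3 + 10z^2 + 9z + 5
factors = [[1, 1], [1, 0, 1], [1, 4, 5]]       # (z + 1), (z^2 + 1), (z^2 + 4z + 5)

expanded = [1]
for factor in factors:
    expanded = poly_multiply(expanded, factor)

zeros = [1j, -1j, -1, complex(-2, -1), complex(-2, 1)]
residuals = [abs(evaluate(P, z)) for z in zeros]
print(expanded == P, residuals)
```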


4.3 Complex numbers as vectors 


So far we have discussed the basic arithmetic of complex numbers but have 
not mentioned their vector properties. To do this, and to give a geometrical 
representation of complex numbers, we plot them as points in a plane called 
the complex plane or, sometimes, the z-plane. Specifically, we shall use the 
real part of the complex number as its horizontal or x-coordinate and the 
imaginary part of the complex number as its vertical or y-coordinate. Thus 
to each complex number there corresponds just one point in the complex 
plane and, conversely, to each point in the complex plane there corresponds 
just one complex number. The relationship between points and complex 
numbers is one-one. In the complex plane, the x-axis is the real axis and the 
y-axis is the imaginary axis. Other accounts of this subject often refer to 
this geometrical representation of complex numbers as the Argand diagram, in 
honour of its inventor. 
Fig. 4.1 Representation of complex numbers: (a) point representation; (b) vector
representation.


In the complex plane, a complex number may either be considered as a 
point in the plane or, equivalently, as the directed straight line element from 
the origin to the point in question. We shall remember this dual relationship 
between points and vectors but, for simplicity, will usually speak only of 
points in the complex plane. 

This duality between points and vectors is indicated in Fig. 4-1 where the 
complex numbers $z = 1$, $z = i$, $z = 2 + i$, $z = 2 - i$, and $z = -1 - i$
have been represented as points (Fig. 4-1 (a)) and as vectors (Fig. 4-1 (b)). 
In the case of the vector representation, arrows have been added to show that 
the vector is drawn from the origin to the point in question. 

Notice that if a number, together with its complex conjugate, are plotted
in the complex plane, as for example $2 + i$ and $2 - i$ in Fig. 4.1 (a) and (b),
then geometrically, in both the point and the vector representations, one is
obtainable from the other by reflection in the x-axis as though it were a
mirror.


Instead of adding and subtracting vectors analytically by use of Definitions
4-2 and 4-3, the same result may be achieved entirely geometrically as we now
indicate. Consider the sum of the vectors $z_1 = 2 + i$ and $z_2 = 1 + 2i$.
Analytically $z_1 + z_2 = 3 + 3i$, and Fig. 4-2 (a) shows this result. The same
result may be obtained geometrically by the following construction. If we
wish to add vector $z_2$ to $z_1$, then for the purposes of addition we shall
imagine vector $z_2$ to be freed from the origin, so that it is capable of
translation anywhere in the complex plane, but we shall assume that wherever
we re-locate it in the complex plane it will always be kept parallel to its
original position, and its length and sense will be preserved. The result of
adding $z_2$ to $z_1$ is then achieved by translating $z_2$ in the manner
described until its origin is located at the tip of vector $z_1$. The arrows of
vectors $z_1$ and $z_2$ then follow one another in the same sense, and the
vector $z_1 + z_2$ is the line element directed from the origin 0 to the tip of
the vector $z_2$ in its new position. In Fig. 4-2 (a) this construction is
represented by the lower triangle comprising the parallelogram. Such triangles
are vector triangles.

Fig. 4-2 Algebraic operations with complex numbers: (a) vector addition:
$z_1 + z_2$; (b) vector subtraction: $z_1 - z_2$.

A vector not attached to a specific origin or one which, for the purposes 
of combination with another vector, is freed from its origin to be re-located in 
some other part of the complex plane will be called a free vector. This is in 
contrast to a vector that is attached to a definite origin which we shall call a 
bound vector. In the addition of $z_2$ to $z_1$ that we have just performed,
$z_1$ was regarded as a bound vector and $z_2$ as a free vector.

Notice that by the same argument, $z_1$ may be freed and its origin translated
to the tip of the bound vector $z_2$ to form the vector $z_2 + z_1$, which is
the line element directed from the origin to the tip of vector $z_1$ in its new
position. In Fig. 4-2 (a) this construction is represented by the upper triangle
comprising the parallelogram. The fact that both constructions give rise to
the same line representing on the one hand $z_1 + z_2$, and on the other
$z_2 + z_1$, proves that vector addition is commutative, since
$z_1 + z_2 = z_2 + z_1$.

Before proceeding with the discussion of subtraction, we first observe that
Definition 4-4 implies that multiplication of the bound vector $z$ by $-1$
reverses its direction. That is to say its origin remains fixed, but the line
element representing the vector is rotated about the origin through the angle
$\pi$. With this remark in mind we see that subtraction of vector $z_2$ from
$z_1$ (Fig. 4-2 (b)) is just a special case of addition in which the vector to
be added is $-z_2$. The vector $-z_2$ is obtained from $z_2$ by reversing the
direction of $z_2$, as is indicated in Fig. 4-2 (b) by the dotted line directed
into the fourth quadrant. The vector $z_1 - z_2$ is then the line element
directed from the origin to the tip of the reversed vector $z_2$ in its new
position. In Fig. 4-2 (b) this construction is shown in the right-hand half of
the plane. The same construction, with the roles of $z_1$ and $z_2$
interchanged, is shown in the left-hand half of the plane and when compared
with the first result proves that $z_1 - z_2 = -(z_2 - z_1)$. (Why?)
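
The geometrical constructions for addition and subtraction can be checked
numerically. The following is a minimal sketch, assuming Python's built-in
complex type as a stand-in for plane vectors; the variable names are
illustrative and are not part of the text.

```python
# Complex numbers used as plane vectors (values from the addition example).
z1 = complex(2, 1)   # z1 = 2 + i
z2 = complex(1, 2)   # z2 = 1 + 2i

# Lower triangle: translate z2 so that it starts at the tip of z1.
lower = z1 + z2
# Upper triangle: translate z1 so that it starts at the tip of z2.
upper = z2 + z1

print(lower, upper)          # both constructions reach the same point
print(lower == upper)        # commutativity of vector addition

# Subtraction is addition of the reversed vector -z2.
difference = z1 + (-z2)
print(difference, -(z2 - z1))
```

Both triangles of the parallelogram end at the same point, and reversing the
roles of the two vectors in the subtraction simply reverses the result.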

Thus far, complex numbers have been seen to obey the addition, multi- 
plication, and distributive axioms of real numbers, and the reader might be 
forgiven for wondering if there is any significant difference between them and 
the real numbers. The answer is yes. Whereas real numbers can be given a 
natural order according to their size, complex numbers cannot. A glance at 
Fig. 4-1 (b) makes it clear that no natural order exists in the field of complex 
numbers, comprising all numbers in real-imaginary form, since even vectors 
of the same length may be differently directed, for instance the pairs of vectors 
$1$ and $i$, and $2 + i$ and $2 - i$. Whereas it makes sense to order the lengths
of vectors, since these are scalar quantities and may be so ordered, the vectors 
themselves have no natural order. To further our argument we now name the 
length of a vector and introduce a notation whereby it may be manipulated 
in equations. 


Fig. 4-3 Modulus and argument representation.

DEFINITION 4-7 (modulus of a vector) The quantity

$$|z| = (x^2 + y^2)^{1/2}$$

is called the modulus of the vector $z = x + iy$. It is the length of the line
element drawn from the origin to the point $(x, y)$ in the complex plane (see
Fig. 4-3).


Example 4-7 If $z = 3 + 4i$, then $|z| = (3^2 + 4^2)^{1/2} = 5$.

Notice that in the special case $\operatorname{Im} z = 0$, $|z|$ reduces to the
absolute value of a real number since, as always, the positive square root is
involved in the definition. The following useful results are easily verified:

$$z\bar{z} = |z|^2; \qquad |z_1 z_2| = |z_1| \cdot |z_2|.$$


If either the upper or lower triangles comprising the parallelogram in
Fig. 4-2 (a) are considered, then clearly, when expressed in terms of the
modulus, the Euclidean theorem 'the sum of the lengths of any two sides of
a triangle exceeds the length of the third side' becomes the following inequality
relating moduli:

$$|z_1| + |z_2| \ge |z_1 + z_2|. \tag{4-11}$$

Equality will occur only when $z_1$ and $z_2$ are collinear. For obvious reasons
Eqn (4-11) is called the triangle inequality, and it has already been encountered
in simple form when we discussed the absolute value of the sum of two real
numbers. An analytic proof of result (4-11) is set as a problem at the end of the
chapter.

Another useful inequality relating the moduli of the complex numbers
$z_1$ and $z_2$ is

$$|z_1 + z_2| \ge \bigl|\,|z_1| - |z_2|\,\bigr|, \tag{4-12}$$

where again equality occurs only when $z_1$ and $z_2$ are collinear. The proof of
this is also left to the reader as a problem.


Example 4-8 If $z_1 = 3 + 4i$ and $z_2 = 4 + 3i$, then $z_1 + z_2 = 7 + 7i$.
Hence $|z_1| = (3^2 + 4^2)^{1/2} = 5$, $|z_2| = (4^2 + 3^2)^{1/2} = 5$, and
$|z_1 + z_2| = (7^2 + 7^2)^{1/2} = \sqrt{98}$, so that $|z_1| + |z_2| = 10$ and
$\bigl|\,|z_1| - |z_2|\,\bigr| = 0$. We have thus verified inequalities (4-11) and
(4-12) in this special case, for they demand that for any $z_1$ and $z_2$

$$\bigl|\,|z_1| - |z_2|\,\bigr| \le |z_1 + z_2| \le |z_1| + |z_2|,$$

which in this case corresponds to the valid inequality

$$0 < \sqrt{98} < 10.$$
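
The same verification can be carried out numerically. The sketch below assumes
Python's built-in complex type, for which `abs` returns the modulus; the
variable names are illustrative only.

```python
z1 = complex(3, 4)
z2 = complex(4, 3)

lower_bound = abs(abs(z1) - abs(z2))   # | |z1| - |z2| |
middle = abs(z1 + z2)                  # |z1 + z2|
upper_bound = abs(z1) + abs(z2)        # |z1| + |z2|

print(lower_bound, middle, upper_bound)
print(lower_bound <= middle <= upper_bound)   # inequalities (4-11) and (4-12)
```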
4.4 Modulus-argument form of complex numbers


Referring again to Fig. 4-3, we see that the complex number $z$ need not be
specified in the standard form for it may equally well be specified by giving
both the value of $|z|$ and the angle $\theta$ which, by convention, is always
measured positively in an anti-clockwise direction from the x-axis to the line
of the vector $z$. The angle $\theta$ is the argument of $z$ and we shall write
$\theta = \arg z$. The argument of $z$ is indeterminate with respect to multiples
of $2\pi$, because angles $\theta$ and $\theta + 2k\pi$, where $k$ is any integer,
will give rise to the same line on Fig. 4-3. Later we shall see that this
indeterminacy in $\theta$ plays an important role in the determination of the
roots of complex numbers. When $\theta = \arg z$ is restricted to the interval
$-\pi < \theta \le \pi$, it will be termed the principal value of the argument.

If we define the real number $r$ by the equation $r = |z|$, and still set
$\theta = \arg z$, then the ordered number pair $(r, \theta)$ describes the polar
coordinates of the point $z$ in Fig. 4-3. That is, it gives the radial distance
of a point from the origin together with its bearing measured from a fixed line
through the origin. The relationship between the Cartesian coordinates $(x, y)$
and the polar coordinates $(r, \theta)$ of the same complex number $z$ is
immediate, since from Fig. 4-3 we have

$$x = r\cos\theta, \qquad y = r\sin\theta \tag{4-13}$$

or, equivalently,

$$r = (x^2 + y^2)^{1/2}, \qquad \cos\theta = \frac{x}{(x^2 + y^2)^{1/2}}, \qquad \sin\theta = \frac{y}{(x^2 + y^2)^{1/2}}. \tag{4-14}$$

Thus the complex number, or vector, $z = x + iy$ may also be written in the
modulus-argument form

$$z = r(\cos\theta + i\sin\theta). \tag{4-15}$$


Because $\arg z$ is indeterminate up to an angle $2k\pi$, we must phrase our
definition of equality between two complex numbers carefully when it is to
refer to complex numbers expressed in modulus-argument form.

DEFINITION 4-8 The two numbers $z_1 = r(\cos\theta + i\sin\theta)$ and
$z_2 = \rho(\cos\phi + i\sin\phi)$ expressed in modulus-argument form will be
said to be equal if, and only if, $r = \rho$ and $\theta = \phi + 2k\pi$ for some
integer $k$.

Equations (4-13) and (4-14) enable immediate interchange between the
modulus-argument and the real-imaginary forms of $z$, as the following
examples indicate.


Example 4-9
(a) Express $z = -4\sqrt{3} + 4i$ in modulus-argument form;
(b) Express $z = 2 + 5i$ in modulus-argument form;
(c) If $|z| = 3$ and $\arg z = -\pi/10$, express $z$ in real-imaginary form.

Solution (a) From Eqn (4-14), $r = |z| = [(-4\sqrt{3})^2 + 4^2]^{1/2} = 8$, whilst
$\cos\theta = -(4\sqrt{3})/8 = -\sqrt{3}/2$ and $\sin\theta = 4/8 = 1/2$, from
which we deduce that the principal value of $\theta$ must lie in the second
quadrant with $\theta = \arg z = 5\pi/6$. Hence, in modulus-argument form

$$z = 8\left(\cos\frac{5\pi}{6} + i\sin\frac{5\pi}{6}\right).$$

Notice that although we could have written
$\theta = \arg z = \arctan(-1/\sqrt{3})$, it would not then have been clear in
which quadrant $\theta$ must lie, and, consequently, we shall always specify
$\sin\theta$ and $\cos\theta$ separately.

Solution (b) Again from Eqn (4-14), $r = |z| = (2^2 + 5^2)^{1/2} = \sqrt{29}$,
whilst this time $\cos\theta = 2/\sqrt{29}$ and $\sin\theta = 5/\sqrt{29}$, from
which we deduce that the principal value of $\theta$ must lie in the first
quadrant with $\theta = \arg z = 1.1903$ rad. Hence, in modulus-argument form

$$z = \sqrt{29}(\cos 1.1903 + i\sin 1.1903).$$

Solution (c) The result is immediate, since Eqn (4-15) gives

$$z = 3\left[\cos\left(-\frac{\pi}{10}\right) + i\sin\left(-\frac{\pi}{10}\right)\right] = 2.8532 - 0.9271i.$$
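
The interchange between the two forms is easily mechanized. The following is a
minimal sketch using Python's standard `math` module; the function names
`to_modulus_argument` and `to_real_imaginary` are illustrative assumptions and
do not come from the text.

```python
import math

def to_modulus_argument(z):
    """Return (r, theta) with theta the principal value of arg z."""
    # math.atan2 inspects the signs of both parts, so the quadrant is
    # resolved just as it is when sin and cos are specified separately.
    return abs(z), math.atan2(z.imag, z.real)

def to_real_imaginary(r, theta):
    """Return r(cos theta + i sin theta) as a Python complex number."""
    return complex(r * math.cos(theta), r * math.sin(theta))

r_a, theta_a = to_modulus_argument(complex(-4 * math.sqrt(3), 4))   # part (a)
r_b, theta_b = to_modulus_argument(complex(2, 5))                   # part (b)
z_c = to_real_imaginary(3, -math.pi / 10)                           # part (c)

print(r_a, theta_a / math.pi)   # modulus, and argument as a multiple of pi
print(r_b, theta_b)
print(z_c)
```

Using the two-argument arctangent avoids the quadrant ambiguity that arises
when only the ratio of the imaginary to the real part is supplied.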


We now examine the consequences of multiplication and division for
complex numbers expressed in modulus-argument form. Let $z_1$ and $z_2$ be
the two complex numbers:

$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad\text{and}\quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2). \tag{4-16}$$

Then by direct multiplication we find that

$$z_1 z_2 = r_1 r_2[(\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)],$$

and using the trigonometric identities for $\cos(\theta_1 + \theta_2)$ and
$\sin(\theta_1 + \theta_2)$ this may be written as

$$z_1 z_2 = r_1 r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)]. \tag{4-17}$$

We have thus proved that the product $z_1 z_2$ is a complex number
with modulus $|z_1 z_2| = r_1 r_2$ and argument
$\arg(z_1 z_2) = \theta_1 + \theta_2 = \arg z_1 + \arg z_2$. Thus the result of
multiplying two complex numbers is to produce a complex number whose modulus
is the product of the two separate moduli and whose argument is the sum of the
two separate arguments (see Fig. 4-4). A special case results if we write

$$i = \cos\tfrac{1}{2}\pi + i\sin\tfrac{1}{2}\pi. \tag{4-18}$$

It follows that in the z-plane, multiplication by $i$ corresponds geometrically
to an anti-clockwise rotation through $\tfrac{1}{2}\pi$ without any change of
size. To illustrate this, the vectors $iz_1$ and $iz_2$ have been added to
Fig. 4-4.


Fig. 4-4 Multiplication and division: $z_1 z_2$, $z_1/z_2$.

By repeated application of Eqn (4-17) it is easily proved that if
$z_m = r_m(\cos\theta_m + i\sin\theta_m)$ for $m = 1, 2, \ldots, n$, then

$$z_1 z_2 \cdots z_n = r_1 r_2 \cdots r_n[\cos(\theta_1 + \theta_2 + \cdots + \theta_n) + i\sin(\theta_1 + \theta_2 + \cdots + \theta_n)]. \tag{4-19}$$

An argument essentially similar to that which gave rise to Eqn (4-17),
but this time using the trigonometric identities for $\cos(\theta_1 - \theta_2)$
and $\sin(\theta_1 - \theta_2)$, establishes that whenever $z_2 \ne 0$, then with
the same notation we have

$$\frac{z_1}{z_2} = \frac{r_1}{r_2}[\cos(\theta_1 - \theta_2) + i\sin(\theta_1 - \theta_2)]. \tag{4-20}$$

Obviously $|z_1/z_2| = r_1/r_2 = |z_1|/|z_2|$ and
$\arg(z_1/z_2) = \theta_1 - \theta_2 = \arg z_1 - \arg z_2$. Expressed in words,
this says that the result of dividing two complex numbers is to produce a
complex number whose modulus is the quotient of the separate moduli and whose
argument is the difference of the two separate arguments.
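
These rules for products and quotients can be illustrated numerically. The
sketch below uses `cmath.rect` and `cmath.polar` from Python's standard
library; the moduli and arguments chosen are arbitrary illustrative values,
not taken from the text.

```python
import cmath

z1 = cmath.rect(2.0, 0.5)   # modulus 2, argument 0.5 rad
z2 = cmath.rect(3.0, 0.8)   # modulus 3, argument 0.8 rad

r_prod, arg_prod = cmath.polar(z1 * z2)   # moduli multiply, arguments add
r_quot, arg_quot = cmath.polar(z1 / z2)   # moduli divide, arguments subtract

print(r_prod, arg_prod)   # close to 6 and 1.3
print(r_quot, arg_quot)   # close to 2/3 and -0.3

# Multiplication by i rotates anti-clockwise through pi/2 with no change of size.
r_rot, arg_rot = cmath.polar(1j * z1)
print(r_rot, arg_rot - 0.5)   # close to 2 and pi/2
```

Since `cmath.polar` always returns the principal value of the argument, a sum
of arguments falling outside the principal interval would be reported after
reduction by a multiple of $2\pi$.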

A most important special case of Eqn (4-19) occurs when all the $z_1$, $z_2$,
$\ldots$, $z_n$ are equal to the same complex number
$z = r(\cos\theta + i\sin\theta)$, say. The result then becomes

$$z^n = r^n(\cos n\theta + i\sin n\theta).$$

Substituting for $z$ and cancelling a real factor $r^n$, we obtain the following
important theorem.

THEOREM 4-2 (de Moivre's Theorem)

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.$$


A more subtle argument would have yielded the fact that this remarkable 
result is true for all real values of n, and not just for the integral values 
utilized in our proof. This will be undertaken later when the complex exponen- 
tial function has been discussed. 

Theorem 4-2 provides a simple method by which certain forms of
trigonometric identity may be established. One typical example is enough to
illustrate this. 


Example 4-10 Let us relate $\sin 4\theta$ and $\cos 4\theta$ to sums of powers of
$\sin\theta$ and $\cos\theta$. Set $n = 4$ in Theorem 4-2 and expand the left-hand
side by the binomial theorem, using the fact that $i^2 = -1$, $i^3 = -i$,
$i^4 = 1$, etc., to obtain

$$\cos^4\theta + 4i\cos^3\theta\sin\theta - 6\cos^2\theta\sin^2\theta - 4i\cos\theta\sin^3\theta + \sin^4\theta = \cos 4\theta + i\sin 4\theta.$$

Then, recalling that equality of complex numbers means equality of their
real and imaginary parts considered separately, we have the two results:

equality of real parts

$$\cos^4\theta - 6\cos^2\theta\sin^2\theta + \sin^4\theta = \cos 4\theta,$$

and

equality of imaginary parts

$$4(\cos^3\theta\sin\theta - \cos\theta\sin^3\theta) = \sin 4\theta.$$

These are the desired results. It is characteristic of complex numbers that any
single complex equality implies two real equalities, and even if only one is
sought the other will be generated automatically. The same method works
for any positive integral value of $n$ when it will connect $\sin n\theta$ and
$\cos n\theta$ with sums of powers of $\sin\theta$ and $\cos\theta$.
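
As a check on the algebra, the two identities may be tested at a few sample
angles. This is only a numerical spot check under the assumption of ordinary
floating-point arithmetic, not a proof; the function name
`check_multiple_angle` is illustrative.

```python
import math

def check_multiple_angle(theta):
    """Return the errors in the cos 4θ and sin 4θ identities at theta."""
    c, s = math.cos(theta), math.sin(theta)
    real_part = c**4 - 6 * c**2 * s**2 + s**4   # should equal cos 4θ
    imag_part = 4 * (c**3 * s - c * s**3)       # should equal sin 4θ
    return real_part - math.cos(4 * theta), imag_part - math.sin(4 * theta)

for theta in (0.0, 0.3, 1.0, 2.5):
    print(theta, check_multiple_angle(theta))   # both differences near zero
```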

We shall return to this idea in connection with the exponential function,
and show that it is possible to use de Moivre's theorem to express
$\sin^n\theta$ and $\cos^n\theta$ in terms of sums involving $\sin r\theta$ and
$\cos r\theta$.

Sometimes Theorem 4-2 can be used to reduce the labour of computation 
as now shown. 


Example 4-11 We shall evaluate $z^{10}$ where $z = 1 + i$. Rather than making
repeated multiplications, or applying the binomial theorem, we write $z$ in
modulus-argument form as $z = \sqrt{2}(\cos \pi/4 + i\sin \pi/4)$, when we have
$z^{10} = (\sqrt{2})^{10}(\cos \pi/4 + i\sin \pi/4)^{10}$. By de Moivre's theorem
this becomes

$$z^{10} = 2^5\left(\cos\frac{5\pi}{2} + i\sin\frac{5\pi}{2}\right) = 32i.$$
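
The same computation may be sketched in Python, assuming the standard `cmath`
module; the direct power is included only for comparison.

```python
import cmath

z = complex(1, 1)
r, theta = cmath.polar(z)              # r = sqrt(2), theta = pi/4

# de Moivre: z**n = r**n (cos n*theta + i sin n*theta)
z10 = cmath.rect(r**10, 10 * theta)

print(z10)       # approximately 32i, up to rounding error
print(z**10)     # direct evaluation for comparison
```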


4.5 Roots of complex numbers


When performing algebra on real numbers the idea of the root of a number 
plays a fundamental part. The same is true when manipulating complex 
numbers, and we now discuss the general ideas involved in determining their 
roots. 

Let $p/q$ be any rational number, where $p$ and $q$ are integers with $q$
supposed positive. We shall assume that $p$ and $q$ have no common factor.

DEFINITION 4-9 We define $z^{p/q}$ by saying that:

$$w = z^{p/q} \iff w^q = z^p.$$

Let

$$w = \rho(\cos\phi + i\sin\phi) \quad\text{and}\quad z = r(\cos\theta + i\sin\theta). \tag{4-21}$$

Then from Definition 4-9 and de Moivre's theorem we have

$$\rho^q(\cos q\phi + i\sin q\phi) = r^p(\cos p\theta + i\sin p\theta). \tag{4-22}$$

Now from Definition 4-8 it follows that

$$\rho^q = r^p \quad\text{and}\quad q\phi = p\theta + 2k\pi, \tag{4-23}$$

and so

$$\rho = r^{p/q} \quad\text{and}\quad \phi = \frac{p\theta + 2k\pi}{q}. \tag{4-24}$$

The expressions $w = z^{p/q}$ thus have the general form

$$w = z^{p/q} = r^{p/q}\left[\cos\left(\frac{p\theta + 2k\pi}{q}\right) + i\sin\left(\frac{p\theta + 2k\pi}{q}\right)\right] \quad\text{with $k$ an integer.} \tag{4-25}$$

It is easily seen that only $q$ different values $w_0, w_1, w_2, \ldots, w_{q-1}$
of $w$ will result from Eqn (4-25) as the integer $k$ increases through
successive integral values. It is usual to give $k$ the $q$ successive values
$k = 0, 1, 2, \ldots, q - 1$. If $k$ is allowed to increase beyond the value
$q - 1$, then the numbers $w_0, w_1, \ldots, w_{q-1}$ will simply be generated
again because of the periodicity properties of the sine and cosine functions.
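
Eqn (4-25) translates directly into a short routine. The following is a
minimal sketch, assuming $z \ne 0$, $q$ a positive integer, and $p$ and $q$
without common factor; the function name `rational_power` is an illustrative
assumption and not notation from the text.

```python
import cmath
import math

def rational_power(z, p, q):
    """Return the q values of z**(p/q) given by Eqn (4-25)."""
    r, theta = cmath.polar(z)            # modulus and principal argument
    modulus = r ** (p / q)
    return [
        cmath.rect(modulus, (p * theta + 2 * k * math.pi) / q)
        for k in range(q)                # k = 0, 1, ..., q - 1
    ]

for w in rational_power(complex(1, 0), 1, 5):   # the five values of 1**(1/5)
    print(w, w**5)                              # each w**5 is close to 1
```

Each value returned satisfies $w^q = z^p$ to within rounding error, which
provides a convenient check on any hand calculation.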


Example 4-12 We illustrate the use of Eqn (4-25) by determining the $n$
numbers $w$ satisfying the equation $w = (1)^{1/n}$. For obvious reasons these
are called the nth roots of unity. Comparing this equation with the general
expression $w = z^{p/q}$ that has just been discussed we see that we must make
the identifications $z = 1$, $p = 1$, and $q = n$. To proceed further we must
write the number unity in its modulus-argument form

$$1 = 1 \cdot (\cos 0 + i\sin 0),$$

so that comparing this with $z$ in Eqn (4-21) we see that the further
identifications $r = 1$ and $\theta = 0$ must be made. Substitution of these
quantities into Eqn (4-25) then gives the result

$$w_k = \cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n} \quad\text{with } k = 0, 1, 2, \ldots, n - 1.$$

The result of this calculation with $n = 5$, for example, is to generate the
fifth roots of unity. In Fig. 4-5 these roots are plotted as the numbers
$w_0, w_1, \ldots, w_4$ in the complex plane. They are uniformly distributed
around the unit circle centred on the origin. By making use of the vector
properties of complex numbers we shall usually represent this circle by the
convenient notation $|z| = 1$. (Why?)


Fig. 4-5 Fifth roots of unity. Fig. 4-6 Roots of $w = (1 + i)^{2/3}$.


Example 4-13 As a slightly more general example we now determine $z^{2/3}$,
when $z = 1 + i$. In this case $p = 2$, $q = 3$, and in modulus-argument form,
$z = \sqrt{2}(\cos \pi/4 + i\sin \pi/4)$ showing that $r = \sqrt{2}$ and
$\theta = \pi/4$. Substitution into Eqn (4-25) gives

$$w = 2^{1/3}\left[\cos\left(\frac{1 + 4k}{6}\right)\pi + i\sin\left(\frac{1 + 4k}{6}\right)\pi\right] \quad\text{with } k = 0, 1, 2.$$

The three roots $w_0$, $w_1$, and $w_2$ are thus:

$$(k = 0):\quad w_0 = 2^{1/3}\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right) = 2^{1/3}\left(\frac{\sqrt{3}}{2} + \frac{i}{2}\right),$$

$$(k = 1):\quad w_1 = 2^{1/3}\left(\cos\frac{5\pi}{6} + i\sin\frac{5\pi}{6}\right) = 2^{1/3}\left(-\frac{\sqrt{3}}{2} + \frac{i}{2}\right),$$

$$(k = 2):\quad w_2 = 2^{1/3}\left(\cos\frac{3\pi}{2} + i\sin\frac{3\pi}{2}\right) = -2^{1/3}i.$$

These are plotted in the complex plane in Fig. 4-6, where they are seen to be
uniformly distributed around the circle $|z| = 2^{1/3}$.


Example 4-14 As a final example let us find the roots of the equation

$$w = i^{-1/3}.$$

In terms of the notation of Eqns (4-21) and (4-25), and recalling that we have
agreed always to take $q$ as positive, we have $p = -1$, $q = 3$, and $z = i$.
Now in modulus-argument form

$$i = 1 \cdot \left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right),$$

so that $r = 1$ and $\theta = \pi/2$. Hence, substituting into Eqn (4-25), we
find that

$$w = \cos\left(\frac{4k - 1}{6}\right)\pi + i\sin\left(\frac{4k - 1}{6}\right)\pi \quad\text{with } k = 0, 1, 2.$$

Hence the three roots $w_0$, $w_1$, and $w_2$ are:

$(k = 0)$: $w_0 = \cos \pi/6 - i\sin \pi/6 = \tfrac{1}{2}(\sqrt{3} - i)$,
$(k = 1)$: $w_1 = \cos \pi/2 + i\sin \pi/2 = i$,
$(k = 2)$: $w_2 = \cos 7\pi/6 + i\sin 7\pi/6 = -\tfrac{1}{2}(\sqrt{3} + i)$.


This completes our preliminary encounter with complex numbers, and 
our study will be resumed later in connection with the complex exponential 
function and with functions of a complex variable. The remainder of this 
chapter is devoted to developing the foundations of our study of general 
vectors. 


4.6 Introduction to space vectors 


It is clear that any set of vector quantities that do not all lie in a plane cannot 
be represented vectorially in the form of complex numbers. For example, 
even the vectors describing the velocity of a vehicle as it is driven at constant
speed past fixed points on a winding hill could not be so represented. Pair- 
wise these velocity vectors define planes, and so could be represented by 
complex numbers in those planes, though different pairs of vectors would 
define different planes, thereby making any general representation impossible 
in terms of complex numbers. The trouble here is not hard to find. It is that 
complex numbers just happen to be capable of representation as planar 
vectors with their own appropriate descriptive language, and they were not 
developed with general vector representation in mind. In short, they are 
complex numbers first and vectors second: not the other way around. 

To overcome this limitation and to be able to describe arbitrary vector 
quantities we must preserve the idea of a vector as a directed length, but 
re-think its description. This is best achieved using a diagram, so consider 
Fig. 4-7 which depicts the mutually perpendicular Cartesian axes O{x, y, z} 
with origin O. In more mathematical terms we describe these axes as being 
mutually orthogonal. This is a technical term that in a geometrical context 
has the same meaning as perpendicular, though it is often used in a wider 
sense, when the word perpendicular would be inappropriate. Henceforth we 
shall almost always use the term orthogonal. 

The manner of identification of the x, y, and z coordinate axes is not
arbitrary, but is made in such a manner that they form a right-handed system
of axes. By this we mean that having assigned axes for the variables x and y,
together with the directions in which they increase positively, the direction of
positive z is then chosen to be that in which a right-handed screw would
advance were it aligned with the third axis and rotated in the sense x to y.
This sense of rotation is indicated in Fig. 4-7 by means of a directed spiral
about the z-axis. In the diagram the y- and z-axes are supposed to lie in the
plane of the paper with the x-axis pointing out of the paper towards the
viewer. Later we shall refer to this right-handed property in connection with
axes which are not orthogonal, when right-handedness is still to be interpreted
in exactly the same sense as above.

Fig. 4-7 Right-handed Cartesian axes. [Labelled points: $C(0, 0, a_3)$,
$P(a_1, a_2, a_3)$, $Q(a_1, a_2, 0)$.]

This right-handed property of the system of axes is shared by each pair of
axes in turn, provided the senses of rotation are appropriately defined. The
following table describes the convention that is always adopted.

Table 4-1 Right-handed axes

Rotate from    Rotate to    R-H screw advances in direction of positive
     x             y                          z
     y             z                          x
     z             x                          y

The table can easily be remembered in the concise form

x y z
y z x
z x y

where the entry in any row is obtained from the entry in the row above by 
transferring the first letter of that entry to the last position. These entries are 
called cyclic permutations of the letters x, y, and z, and further cyclic permuta- 
tions will simply regenerate the table. These rules describe the right-handed 
symmetry of the O{x, y, z} axes. If any two letters in an entry are inter- 
changed, then by the same rule, the negative direction of the third axis is 
defined. Hence the set of letters y x z are to be interpreted ‘rotate from y to 
x to make a right-handed screw aligned with the z-axis advance in the direction 
of negative z’. 
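
The rule of cyclic permutations is simple enough to state as a short routine.
The sketch below is illustrative only; the function name `screw_advance` and
its return convention are assumptions made here, not notation from the text.

```python
AXES = ("x", "y", "z")

def screw_advance(rotate_from, rotate_to):
    """Return (sign, axis) for a right-handed screw rotated between two axes.

    sign is +1 when the screw advances along the positive third axis and
    -1 when it advances along the negative third axis.
    """
    if rotate_from == rotate_to or rotate_from not in AXES or rotate_to not in AXES:
        raise ValueError("expected two different axes chosen from x, y, z")
    third = next(a for a in AXES if a not in (rotate_from, rotate_to))
    i = AXES.index(rotate_from)
    # The cyclic order x -> y -> z -> x gives the positive direction.
    is_cyclic = AXES[(i + 1) % 3] == rotate_to
    return (1 if is_cyclic else -1), third

print(screw_advance("x", "y"))   # (1, 'z')
print(screw_advance("y", "x"))   # (-1, 'z')
```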

If in the above argument a right-handed screw motion had been replaced 
by a left-handed screw motion, then a left-handed system of axes would have 
resulted. Although a left-handed system of axes is in all respects equivalent to 
a right-handed system for the purposes of vector representation, it is customary 
to work with right-handed systems. 

Let $P$ be the point with coordinates $x = a_1$, $y = a_2$, and $z = a_3$
illustrated in Fig. 4-7. We shall denote it by the more concise notation
$(a_1, a_2, a_3)$ where the first, second, and third entries in this ordered
number triple represent the x, y, and z coordinates, respectively. Then from the
point of view of coordinate geometry it is the point $P$ that is of interest,
whereas from the point of view of vectors it is the directed line element from
$O$ to $P$ that is of interest. To signify that it is the vector quantity that
interests us here we shall write $\mathbf{OP}$. Notice that by this convention
the vector $\mathbf{PO}$ is the directed line from $P$ to $O$ and is opposite in
sense to $\mathbf{OP}$. In future we will denote the length of the vector
$\mathbf{OP}$ by $|\mathbf{OP}|$, which is a scalar, and by definition this
length will always be positive.
In Fig. 4-7 the lengths $OA = a_1$, $OB = a_2$, and $OC = a_3$ are called the
orthogonal projections of $\mathbf{OP}$ onto the x-, y-, and z-axes, and a simple
application of Pythagoras' theorem gives the result

$$|\mathbf{OP}|^2 = (OA)^2 + (OB)^2 + (OC)^2$$

or,

$$|\mathbf{OP}|^2 = a_1^2 + a_2^2 + a_3^2.$$

Dividing by $|\mathbf{OP}|^2$ this becomes

$$1 = \left(\frac{a_1}{|\mathbf{OP}|}\right)^2 + \left(\frac{a_2}{|\mathbf{OP}|}\right)^2 + \left(\frac{a_3}{|\mathbf{OP}|}\right)^2,$$

which can then be rewritten in terms of the angles $\theta_1$, $\theta_2$,
$\theta_3$ that $\mathbf{OP}$ makes with the x-, y-, and z-axes, respectively, as

$$1 = \cos^2\theta_1 + \cos^2\theta_2 + \cos^2\theta_3. \tag{4-26}$$

If the numbers $l$, $m$, and $n$ are defined by the relations

$$l = \cos\theta_1, \qquad m = \cos\theta_2, \qquad n = \cos\theta_3, \tag{4-27}$$

then Eqn (4-26) becomes

$$1 = l^2 + m^2 + n^2. \tag{4-28}$$

For obvious reasons $l$, $m$, and $n$ are called the direction cosines of
$\mathbf{OP}$ with respect to the axes $O\{x, y, z\}$ and it is often convenient
to write them in the form of an ordered number triple as $\{l, m, n\}$. The
angles $\theta_1$, $\theta_2$, and $\theta_3$ are indeterminate to within a
multiple of $2\pi$ and, by convention, they will always be taken to lie in the
interval $[0, \pi]$.

Consider the direction cosines $l$, $m$, $n$ as defining a point $P'$ in space
with coordinates $x = l$, $y = m$, and $z = n$, then, by Pythagoras' theorem and
Eqn (4-28), the vector $\mathbf{OP'}$ must have unit length. The direction and
sense of $\mathbf{OP'}$ are the same as those of $\mathbf{OP}$; only the lengths
are different. Vectors of unit length in given directions prove to be extremely
useful in vector analysis so they are appropriately called unit vectors.
Now by definition, the direction cosines $l$, $m$, $n$ are proportional to the
coordinates $a_1$, $a_2$, $a_3$ of the point $P$ and consequently the numbers
$a_1$, $a_2$, and $a_3$ are often called the direction ratios of $\mathbf{OP}$.
To convert direction ratios to direction cosines it is necessary to normalize
them by dividing by the square root of the sum of the squares of the direction
ratios. This is, of course, equivalent to division by the quantity we have
agreed to denote by $|\mathbf{OP}|$.

Example 4-15 Determine the direction cosines and the angles $\theta_1$,
$\theta_2$, and $\theta_3$ of the vector $\mathbf{OP}$, where $P$ is the point
$(1, -2, 4)$.

Solution The direction ratios are $1$, $-2$, $4$, and $|\mathbf{OP}|$, which is
the square root of the sum of the squares of the direction ratios, is

$$|\mathbf{OP}| = (1^2 + (-2)^2 + 4^2)^{1/2} = \sqrt{21}.$$

Hence the direction cosines of $\mathbf{OP}$ are $l = 1/\sqrt{21}$,
$m = -2/\sqrt{21}$, and $n = 4/\sqrt{21}$, from which the angles $\theta_1$,
$\theta_2$, and $\theta_3$ are seen to be 1.351, 2.022, and 0.510 radians,
respectively. Unless otherwise stated we shall always express angles in terms of
radians, as here.
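
The normalization of direction ratios can be sketched as follows, assuming
Python's standard `math` module; the function name `direction_cosines` is
illustrative and is not notation from the text.

```python
import math

def direction_cosines(a1, a2, a3):
    """Return (length, (l, m, n)) for the vector from O to (a1, a2, a3)."""
    length = math.sqrt(a1**2 + a2**2 + a3**2)
    if length == 0:
        raise ValueError("a vector of zero length has no direction")
    return length, (a1 / length, a2 / length, a3 / length)

length, (l, m, n) = direction_cosines(1, -2, 4)
angles = [math.acos(c) for c in (l, m, n)]   # math.acos returns values in [0, pi]

print(length**2)                       # close to 21
print(l**2 + m**2 + n**2)              # close to 1, as Eqn (4-28) requires
print([round(t, 3) for t in angles])   # the angles in radians
```

Because `math.acos` returns values in $[0, \pi]$, the angles automatically
respect the convention adopted above.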


Example 4-16 Determine the angles of inclination $\theta_1$, $\theta_2$, and
$\theta_3$ of a vector to the x-, y-, and z-axes, respectively, given that its
direction cosines are:

(a) $\{\tfrac{1}{2}, -\sqrt{3}/2, 0\}$,
(b) $\{\tfrac{1}{2}, \tfrac{1}{4}, \sqrt{11}/4\}$.

Solution (a) Here $l = \cos\theta_1 = 1/2$, $m = \cos\theta_2 = -\sqrt{3}/2$,
$n = \cos\theta_3 = 0$, so that $\theta_1 = \pi/3$, $\theta_2 = 5\pi/6$, and
$\theta_3 = \pi/2$. Hence in this case the vector lies entirely in the
(x, y)-plane.

Solution (b) In this case, $l = \cos\theta_1 = 1/2$, $m = \cos\theta_2 = 1/4$,
$n = \cos\theta_3 = \sqrt{11}/4$, so that $\theta_1 = \pi/3$,
$\theta_2 = 1.318$, and $\theta_3 = 0.593$.


Example 4-17 If a vector has direction cosines $\{\tfrac{1}{2}, m, \tfrac{1}{2}\}$
deduce the possible values of $m$. If, in addition, it is stated that the vector
makes an obtuse angle $\theta_2$ with the y-axis determine the value of
$\theta_2$.

Solution We use Eqn (4-28), setting $l = \tfrac{1}{2}$ and $n = \tfrac{1}{2}$ to
obtain

$$(\tfrac{1}{2})^2 + m^2 + (\tfrac{1}{2})^2 = 1,$$

whence $m^2 = 1/2$ or $m = \pm 1/\sqrt{2}$. These values of $m$ correspond to
$\theta_2 = \pi/4$ for $m = 1/\sqrt{2}$, and to $\theta_2 = 3\pi/4$ for
$m = -1/\sqrt{2}$. As the angle $\theta_2$ is required to be obtuse we must
select $\theta_2 = 3\pi/4$.


The idea of a fixed origin is fundamental to coordinate geometry though 
it proves to be rather too restrictive in vector analysis. This is because it is 
only the magnitude, direction, and sense of a vector that usually matter, and
not the choice of origin and coordinate system in which the vector is repre- 
sented. For example, when specifying a wind velocity it is normally sufficient 
to say 20 ft/s due East, without identifying the particular points in space at 
which the air has this velocity. 

In vector work this ambiguity as to the location of a vector in space is
allowed by considering as equivalent any two vectors that may be represented by
directed line elements of equal length which are parallel, and have the same
sense. In Fig. 4-8 we have depicted two vectors $\mathbf{OP}$ and
$\mathbf{O'P'}$ that are equivalent in the sense just defined. Another way to
define this equivalence is to require that when the axes $O\{x, y, z\}$ are
translated, without rotation, to the position $O'\{x', y', z'\}$, the
coordinates of $P'$ with respect to the axes through $O'$ are the same as those
of $P$ with respect to the axes through $O$. That is, if $P$ is the point
$(a_1, a_2, a_3)$ in the system of axes $O\{x, y, z\}$, then $P'$ is the point
$(a_1, a_2, a_3)$ in the system of axes $O'\{x', y', z'\}$. Do not get confused
by this. If $O'$ is the point $(\alpha_1, \alpha_2, \alpha_3)$ with respect to
$O\{x, y, z\}$, then coordinates in the unprimed system are related to those in
the primed system by the equations $x = \alpha_1 + x'$, $y = \alpha_2 + y'$, and
$z = \alpha_3 + z'$.

Fig. 4-8 Translation of axes without rotation.

This freedom to translate vectors now enables us to give direction cosines
to any vector in space and not just to those having their base at $O$. Suppose,
for example, that we require the length and direction cosines of the vector
$\mathbf{AB}$, where $A$ is the point $(a_1, a_2, a_3)$ and $B$ is the point
$(b_1, b_2, b_3)$ when expressed relative to some set of axes $O\{x, y, z\}$.
Then we see at once that the lengths of the projections of $\mathbf{AB}$ on the
x-, y-, and z-axes are $(b_1 - a_1)$, $(b_2 - a_2)$, and $(b_3 - a_3)$,
respectively. Accordingly, by translating the vector $\mathbf{AB}$ until $A$ in
its new position $A'$ coincides with $O$, we see that the tip $B$ in its new
position $B'$ must be the point $((b_1 - a_1), (b_2 - a_2), (b_3 - a_3))$ (see
Fig. 4-9). Hence $|\mathbf{AB}|$, that is the length of $\mathbf{AB}$, is

$$|\mathbf{AB}| = [(b_1 - a_1)^2 + (b_2 - a_2)^2 + (b_3 - a_3)^2]^{1/2}. \tag{4-29}$$

The direction cosines of $\mathbf{AB}$ then follow as before and are

$$l = \frac{b_1 - a_1}{|\mathbf{AB}|}, \qquad m = \frac{b_2 - a_2}{|\mathbf{AB}|}, \qquad n = \frac{b_3 - a_3}{|\mathbf{AB}|}. \tag{4-30}$$

Fig. 4-9 Translation of a vector.


Example 4-18 Find $|\mathbf{AB}|$ and the direction cosines of the vector
$\mathbf{AB}$, if $A$ has coordinates $(1, 2, 3)$ and $B$ the coordinates
$(4, 3, 6)$.

Solution From Eqn (4-29) we see that
$|\mathbf{AB}| = [(4 - 1)^2 + (3 - 2)^2 + (6 - 3)^2]^{1/2} = \sqrt{19}$, whilst
from Eqn (4-30) it follows that $l = 3/\sqrt{19}$, $m = 1/\sqrt{19}$, and
$n = 3/\sqrt{19}$.
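
Eqns (4-29) and (4-30) may be sketched for a vector joining two arbitrary
points. The function name `length_and_direction` is an illustrative assumption,
and the sketch assumes that the two points are distinct.

```python
import math

def length_and_direction(a, b):
    """Return |AB| and the direction cosines of AB for points a and b."""
    ratios = [bi - ai for ai, bi in zip(a, b)]       # direction ratios of AB
    length = math.sqrt(sum(d * d for d in ratios))   # Eqn (4-29)
    return length, [d / length for d in ratios]      # Eqn (4-30)

length, (l, m, n) = length_and_direction((1, 2, 3), (4, 3, 6))
print(length**2)                             # close to 19
print(l * length, m * length, n * length)    # recovers the ratios 3, 1, 3
```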


It is now convenient to introduce a triad of unit vectors, denoted by
$\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, that are parallel to and are
directed in the positive senses of the x-, y-, and z-axes, respectively. Here we
remind the reader that these are called unit vectors because they are each of
unit length on the x-, y-, and z-length scales. Notice that the term
right-handed that was applied to the system of axes $O\{x, y, z\}$ also applies
to the triad of vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ when taken in
this order. We shall use this idea again later.

An arbitrary vector in any one of the $\mathbf{i}$, $\mathbf{j}$, or
$\mathbf{k}$ directions may then be obtained by scaling the length of the
appropriate unit vector by a multiplication factor $\mu$. Thus a vector three
times the size of the unit vector $\mathbf{i}$ will be written $3\mathbf{i}$,
whilst a vector twice the size of the unit vector $\mathbf{k}$, but oppositely
directed, will be written $-2\mathbf{k}$.

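Returning to Fig. 4-7 we see that in terms of $\mathbf{i}$, $\mathbf{j}$, and
$\mathbf{k}$, the vectors $\mathbf{OA}$, $\mathbf{OB}$, and $\mathbf{OC}$ may be
written as

$$\mathbf{OA} = a_1\mathbf{i}, \qquad \mathbf{OB} = a_2\mathbf{j}, \qquad \mathbf{OC} = a_3\mathbf{k}.$$

From our ideas of vector addition in a plane the vector $\mathbf{OQ}$ lying in
the (x, y)-plane is $\mathbf{OQ} = \mathbf{OA} + \mathbf{AQ}$ or, because vectors
may be translated, $\mathbf{OQ} = \mathbf{OA} + \mathbf{OB}$. Now in terms of our
unit vector notation this may be written
$\mathbf{OQ} = a_1\mathbf{i} + a_2\mathbf{j}$. Turning attention to the plane
containing points $O$, $Q$, and $P$, we see that by the same argument
$\mathbf{OP} = \mathbf{OQ} + \mathbf{QP}$. Again, because vectors may be
translated, $\mathbf{QP} = \mathbf{OC}$ so that finally, on substituting for
$\mathbf{OQ}$ and $\mathbf{QP}$ in the equation
$\mathbf{OP} = \mathbf{OQ} + \mathbf{QP}$, we obtain

$$\mathbf{OP} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}. \tag{4-31}$$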


For ease of notation, arbitrary vectors, like unit vectors, will usually be
denoted by a single symbol such as $\mathbf{a}$ or $\mathbf{r}$. Thus a general
point $P$ in space with coordinates $(x, y, z)$ will often be written

$$\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}. \tag{4-32}$$

The almost universally accepted convention which we adopt here is to denote
vector quantities by bold face type and scalar quantities by italic type.

Because a vector such as that in Eqn (4-32) identifies a point $P$ in space
it is called a position vector. In the vector representation Eqn (4-31) the
numbers $a_1$, $a_2$, and $a_3$ are called the components of $\mathbf{OP}$.


Two vectors will only be said to be equal if, when written in the form
of Eqn (4-31), their corresponding components are equal. The vector
$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ will be said to be a
scalar multiple $\lambda$ of vector
$\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, and we will write
$\mathbf{a} = \lambda\mathbf{b}$ if, and only if, $a_1 = \lambda b_1$,
$a_2 = \lambda b_2$, and $a_3 = \lambda b_3$. In the special case $\lambda = -1$
we have $\mathbf{a} = -\mathbf{b}$, showing that $|\mathbf{a}| = |\mathbf{b}|$,
but that the senses of $\mathbf{a}$ and $\mathbf{b}$ are opposite. Thus in
Fig. 4-7 we have $\mathbf{OP} = -\mathbf{PO}$.

The zero or null vector $\mathbf{0}$ is the vector whose three components are
each identically zero. It is often denoted by $0$ instead of $\mathbf{0}$, since
confusion is unlikely to arise on account of this simplification of the
notation. Following on from our first ideas of vectors, and in accordance with
the derivation of Eqn (4-31), we now define the operations of addition and
subtraction of vectors.


DEFINITION 4:10 Let a and b be arbitrary vectors with components 
(ai, a2, a3) and (61, be, bs), respectively, so that they may be written 
a= ai + d2j + ask and b = διὶ + bej + b3k. Then we define the sum 
a + b of the two vectors a and b to be the vector (a1 + b1)i + (ae + be)j 
+ (a3 + b3)k. The difference a — b of the two vectors a and b is defined to 
be the vector (a1 — 51)i + (a2 — be)j + (a3 — ba)k. 


Because real numbers are commutative with respect to addition, it follows 
directly from this definition that the operation of vector addition is commuta- 
tive. That is we have a + b = b + a for all vectors a and b. When the sub- 
traction operation is considered the properties of real numbers imply the 
result a — b = —(b — a) for all vectors a and b. 


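Example 4.19 If $\mathbf{a} = \mathbf{i} + \mathbf{j} + 2\mathbf{k}$ and $\mathbf{b} = 3\mathbf{i} - 3\mathbf{j} + \mathbf{k}$, then
$\mathbf{a} + \mathbf{b} = (1 + 3)\mathbf{i} + (1 - 3)\mathbf{j} + (2 + 1)\mathbf{k}$, showing that $\mathbf{a} + \mathbf{b} = 4\mathbf{i} - 2\mathbf{j} + 3\mathbf{k}$.
Reversal of the order of the sum followed by the same argument proves the
commutative property a + b = b + a for these particular vectors. In the
case of subtraction we have $\mathbf{a} - \mathbf{b} = (1 - 3)\mathbf{i} + (1 - (-3))\mathbf{j} + (2 - 1)\mathbf{k}$,
showing that $\mathbf{a} - \mathbf{b} = -2\mathbf{i} + 4\mathbf{j} + \mathbf{k}$. It is easily established that
$\mathbf{a} - \mathbf{b} = -(\mathbf{b} - \mathbf{a})$.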
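The component-wise rules of Definition 4.10 can be checked numerically. The following is a minimal sketch in Python; the helper names `add` and `sub`, and the choice of a tuple of three components to stand for a vector, are assumptions made purely for illustration.

```python
# A vector a1*i + a2*j + a3*k is represented by the tuple (a1, a2, a3).

def add(a, b):
    """Sum a + b, formed component by component (Definition 4.10)."""
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    """Difference a - b, formed component by component."""
    return tuple(x - y for x, y in zip(a, b))

a = (1, 1, 2)    # a = i + j + 2k
b = (3, -3, 1)   # b = 3i - 3j + k

print(add(a, b))  # (4, -2, 3), i.e. 4i - 2j + 3k
print(sub(a, b))  # (-2, 4, 1), i.e. -2i + 4j + k
```

Swapping the arguments of `add` gives the same tuple, while swapping those of `sub` changes the sign of every component, in agreement with the two properties stated above.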


SEC 4-6 INTRODUCTION TO SPACE VECTORS / 143 


Although these particular results could be illustrated diagrammatically,
the vector triangles involved would look essentially the same as those used
earlier in connection with addition and subtraction of complex numbers and
would be arrived at by the same reasoning. Rather than illustrate this specific
case, we present in Fig. 4.10 the results of addition and subtraction of arbitrary
vectors a and b. Because a geometrical projection method is necessary to
illustrate three-dimensional problems on a sheet of paper, such diagrams are
much less useful as a tool than was the case in a plane. Accordingly, we shall
usually concentrate on an analytical approach to vectors, using diagrams
only when they seem likely to be helpful.


Fig. 4.10 Addition and subtraction of vectors.

Two terms worthy of note that are applied to vectors are the names
parallel and anti-parallel. Two vectors will be said to be parallel when their
lines of action are parallel and their senses are the same. Conversely, two
vectors will be said to be anti-parallel when their lines of action are parallel
but their senses are opposite. Thus if a is a vector and μ is a scalar, the vectors
a and μa are parallel if μ > 0 and are anti-parallel if μ < 0. It follows that
two vectors will be parallel if their corresponding direction cosines are equal
and they will be anti-parallel if their corresponding direction cosines are equal
in magnitude but opposite in sign.


Fig. 4.11 Position vectors defining the vector AB.


Example 4.20 The vectors $\mathbf{a} = \mathbf{i} + 2\mathbf{j} - 4\mathbf{k}$ and $\mathbf{b} = 3\mathbf{i} + 6\mathbf{j} - 12\mathbf{k}$ are
such that we may write 3a = b. Since the scalar 3 > 0 it follows that a and
b are parallel. However the vectors $\mathbf{c} = \mathbf{i} - 3\mathbf{j} + \mathbf{k}$ and $\mathbf{d} = -2\mathbf{i} + 6\mathbf{j} - 2\mathbf{k}$
are such that we may write −2c = d and, as the scalar −2 < 0, it follows
that c and d are anti-parallel. By the same argument, the two vectors
$\mathbf{p} = 3\mathbf{i} - \mathbf{j} + 2\mathbf{k}$ and $\mathbf{q} = 6\mathbf{i} + 2\mathbf{j} + 4\mathbf{k}$ are neither parallel nor anti-parallel,
since for no scalar μ is it true that μp = q.
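The test used in Example 4.20 amounts to asking whether one vector is a scalar multiple μ of the other and, if so, what the sign of μ is. A hedged sketch of that test in Python follows; the function names `scalar_multiple` and `classify` and the rounding tolerance are illustrative assumptions, and the first vector is assumed to be non-zero.

```python
def scalar_multiple(a, b, tol=1e-12):
    """Return mu with b = mu * a, or None if no such scalar exists.

    Assumes a is not the zero vector.
    """
    # Read mu off the first non-zero component of a ...
    k = next(i for i, x in enumerate(a) if x != 0)
    mu = b[k] / a[k]
    # ... and accept it only if it reproduces every component of b.
    if all(abs(mu * x - y) <= tol for x, y in zip(a, b)):
        return mu
    return None

def classify(a, b):
    """Classify the pair (a, b) as parallel, anti-parallel, or neither."""
    mu = scalar_multiple(a, b)
    if mu is None or mu == 0:
        return "neither"
    return "parallel" if mu > 0 else "anti-parallel"

print(classify((1, 2, -4), (3, 6, -12)))   # parallel
print(classify((1, -3, 1), (-2, 6, -2)))   # anti-parallel
print(classify((3, -1, 2), (6, 2, 4)))     # neither
```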


The length of the vector AB, which we have already denoted by |AB|,
is a useful quantity and, as with complex numbers, is called the modulus of
the vector AB. Its formal definition follows.


DEFINITION 4.11 The modulus $|\mathbf{a}|$ of the vector $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ is
the positive square root

$$|\mathbf{a}| = (a_1^2 + a_2^2 + a_3^2)^{1/2}.$$


It is an immediate consequence of this definition that any vector r with
direction cosines {l, m, n} may be written in the form

$$\mathbf{r} = |\mathbf{r}|(l\mathbf{i} + m\mathbf{j} + n\mathbf{k}). \qquad (4.33)$$


The proof of this is obvious for, by definition, $l|\mathbf{r}|$ is the x-component of r,
$m|\mathbf{r}|$ is the y-component, and $n|\mathbf{r}|$ is the z-component. The form of Eqn (4.33)
shows that any vector may be expressed as the product of a scalar (its modulus)
and a unit vector defining its direction and sense.
When it is necessary to define an arbitrary vector AB in space, this may
easily be accomplished by using position vectors a and b to identify its end
points A and B. This is illustrated in Fig. 4.11 from which, by the rules of
vector addition, we may write

$$\overrightarrow{OA} + \overrightarrow{AB} = \overrightarrow{OB}$$

or,

$$\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA} = \mathbf{b} - \mathbf{a}.$$


Examination of this simple but useful result suggests that an accurate name
for the vector AB would be the 'position vector of B relative to A', since in
this role it is A that plays the part of the origin. This more exact name is
seldom used since the symbol AB is sufficiently clear as it stands.


Example 4.21 Let points A and B be identified by the position vectors
$\mathbf{a} = -2\mathbf{i} - 3\mathbf{j} + \mathbf{k}$ and $\mathbf{b} = 3\mathbf{i} - \mathbf{j} + 4\mathbf{k}$, respectively. Find the vector AB
together with its modulus and direction cosines.


Solution The diagram in Fig. 4.11 can be taken to represent this situation,
showing that vector AB = b − a. Substituting for the values of a and b,
we find $\overrightarrow{AB} = (3\mathbf{i} - \mathbf{j} + 4\mathbf{k}) - (-2\mathbf{i} - 3\mathbf{j} + \mathbf{k})$, whence $\overrightarrow{AB} = 5\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}$.
Then $|AB| = (5^2 + 2^2 + 3^2)^{1/2} = \sqrt{38}$, after which the usual argument
establishes that $l = 5/\sqrt{38}$, $m = 2/\sqrt{38}$, and $n = 3/\sqrt{38}$.
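The steps of Example 4.21 translate directly into a short calculation. The sketch below is illustrative only: the function names `modulus` and `direction_cosines` are assumptions, and vectors are again held as tuples of components.

```python
import math

def modulus(v):
    """Positive square root of the sum of squared components (Definition 4.11)."""
    return math.sqrt(sum(x * x for x in v))

def direction_cosines(v):
    """Components divided by the modulus, so that v = |v|(l i + m j + n k)."""
    r = modulus(v)
    return tuple(x / r for x in v)

a = (-2, -3, 1)   # position vector of A
b = (3, -1, 4)    # position vector of B

ab = tuple(q - p for p, q in zip(a, b))   # AB = b - a
print(ab)                                  # (5, 2, 3)

l, m, n = direction_cosines(ab)
print(modulus(ab) ** 2)                    # approximately 38
print(l * l + m * m + n * n)               # approximately 1
```

The last line reflects the fact that the squares of the direction cosines of any non-zero vector sum to unity.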


By considering the plane containing the vectors a, b, and b − a in Fig.
4.11, the arguments that established the triangle inequalities for complex
numbers also establish them for arbitrary space vectors. Hence for arbitrary
vectors a and b we have

$$\big|\,|\mathbf{a}| - |\mathbf{b}|\,\big| \le |\mathbf{a} + \mathbf{b}| \le |\mathbf{a}| + |\mathbf{b}|. \qquad (4.34)$$


Finally, to close this section, let us find the angle θ between two vectors
a and b with the direction cosines $\{l_1, m_1, n_1\}$ and $\{l_2, m_2, n_2\}$, respectively.
When the lines of action of the vectors intersect the angle θ is well defined
and, by convention, is always chosen to lie in the interval [0, π]. If the lines
of action of two vectors do not intersect then they are merely translated until
they do, when the angle θ is defined as above. It will suffice to consider the
angle between two unit vectors directed along a and b since the length of the
vectors will obviously not influence the angle between them. From Eqn
(4.33), these unit vectors are seen to be $l_1\mathbf{i} + m_1\mathbf{j} + n_1\mathbf{k}$ and $l_2\mathbf{i} + m_2\mathbf{j} + n_2\mathbf{k}$.
These are shown in Fig. 4.12. They have their tips P and Q at the respective
points $(l_1, m_1, n_1)$ and $(l_2, m_2, n_2)$.

Fig. 4.12 Angle between two lines.


Now, by the cosine rule

$$|PQ|^2 = |OP|^2 + |OQ|^2 - 2|OP|\,|OQ| \cos\theta, \qquad (4.35)$$

but |OP| = |OQ| = 1, and by Eqn (4.29), $|PQ|^2 = (l_2 - l_1)^2 + (m_2 - m_1)^2 + (n_2 - n_1)^2$,
whilst by Eqn (4.28), $l_1^2 + m_1^2 + n_1^2 = l_2^2 + m_2^2 + n_2^2 = 1$.
Expanding the squares therefore gives $|PQ|^2 = 2 - 2(l_1 l_2 + m_1 m_2 + n_1 n_2)$, while the
right-hand side of Eqn (4.35) is $2 - 2\cos\theta$. Consequently, substituting into
Eqn (4.35) and simplifying, we find the desired result

$$\cos\theta = l_1 l_2 + m_1 m_2 + n_1 n_2. \qquad (4.36)$$


The angle of inclination θ follows directly from this equation. The restriction
of the angle between the vectors to the interval [0, π] means that in Fig. 4.12,
it is the angle θ that is selected, and not the angle θ′.

As a particular case, if

$$l_1 l_2 + m_1 m_2 + n_1 n_2 = 0, \qquad (4.37)$$

then the two vectors a and b must be orthogonal.


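Example 4.22 Find the angle of inclination θ between the vectors
$\mathbf{a} = \mathbf{i} + 2\mathbf{j} + 3\mathbf{k}$ and $\mathbf{b} = 2\mathbf{i} - \mathbf{j} - \mathbf{k}$.


Solution Here $|\mathbf{a}| = \sqrt{14}$, $|\mathbf{b}| = \sqrt{6}$, so that the direction cosines $\{l_1, m_1, n_1\}$
of a are $l_1 = 1/\sqrt{14}$, $m_1 = 2/\sqrt{14}$, $n_1 = 3/\sqrt{14}$, whilst the direction cosines
$\{l_2, m_2, n_2\}$ of b are $l_2 = 2/\sqrt{6}$, $m_2 = -1/\sqrt{6}$, $n_2 = -1/\sqrt{6}$. Hence by Eqn
(4.36), the angle θ is the solution of the equation

$$\cos\theta = \left(\frac{1}{\sqrt{14}}\right)\left(\frac{2}{\sqrt{6}}\right) + \left(\frac{2}{\sqrt{14}}\right)\left(\frac{-1}{\sqrt{6}}\right) + \left(\frac{3}{\sqrt{14}}\right)\left(\frac{-1}{\sqrt{6}}\right) = \frac{-3}{2\sqrt{21}},$$

so that

$$\theta = \arccos\left[\frac{-3}{2\sqrt{21}}\right].$$

On account of the restriction of θ to the interval [0, π] it finally follows that
θ = 1.904 rad.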
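Equation (4.36) lends itself to a direct numerical check. The following sketch, in which the function names `direction_cosines` and `angle_between` are assumptions introduced for illustration, evaluates the angle of Example 4.22 from the direction cosines of the two vectors.

```python
import math

def direction_cosines(v):
    """Direction cosines {l, m, n}: each component divided by the modulus."""
    r = math.sqrt(sum(x * x for x in v))
    return tuple(x / r for x in v)

def angle_between(a, b):
    """Angle in [0, pi] obtained from cos(theta) = l1*l2 + m1*m2 + n1*n2."""
    cos_theta = sum(p * q for p, q in
                    zip(direction_cosines(a), direction_cosines(b)))
    # Guard against rounding pushing the value just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

theta = angle_between((1, 2, 3), (2, -1, -1))
print(round(theta, 3))  # 1.904
```

Because `math.acos` returns values in [0, π], the convention adopted for the angle between two vectors is satisfied automatically.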


4.7 Scalar and vector products


If $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ is an arbitrary vector and λ is a scalar, then we have
already defined the product λa to be the vector $\lambda\mathbf{a} = \lambda a_1\mathbf{i} + \lambda a_2\mathbf{j} + \lambda a_3\mathbf{k}$.
Hence the effect of multiplying a vector by a scalar λ is to scale its modulus
by |λ| without changing its line of action; the sense is preserved if λ > 0 and
reversed if λ < 0. The result of this product is to generate a
vector. We must now discuss the multiplication of two vectors.

Here three-dimensional vector algebra differs radically from the vector 
algebra of complex numbers. With complex numbers there is only one 
multiplication operation defined, and the product of two complex numbers is 
always a complex number. In the case of vectors we shall see that two multi- 
plication operations are defined for a pair of vectors. One operation called a 
scalar product generates a scalar, whereas the other operation called a vector 
product generates a vector. The operation of division is not defined for vectors. 

The scalar product of two vectors is a generalization of the notion of the 
orthogonal projection of a line element onto another line and is suggested by 
Eqn (4-36). Its definition follows. 


DEFINITION 4.12 The scalar product of the two vectors $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$
and $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$ is written $\mathbf{a}\cdot\mathbf{b}$ and is defined to be the scalar
quantity

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3.$$


Because of the notation used, a scalar product is often colloquially
called the dot product. Some books favour the notation (a, b) for the scalar
product, when it is then usually called the inner product of vectors a and b.
To exhibit the relation of $\mathbf{a}\cdot\mathbf{b}$ to Eqn (4.36) we first divide $\mathbf{a}\cdot\mathbf{b}$ by the product
of the moduli $|\mathbf{a}||\mathbf{b}|$ to get

$$\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \left(\frac{a_1}{|\mathbf{a}|}\right)\left(\frac{b_1}{|\mathbf{b}|}\right) + \left(\frac{a_2}{|\mathbf{a}|}\right)\left(\frac{b_2}{|\mathbf{b}|}\right) + \left(\frac{a_3}{|\mathbf{a}|}\right)\left(\frac{b_3}{|\mathbf{b}|}\right).$$

Then, from the definition of direction cosines, we recognize that this may be
written

$$\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = l_1 l_2 + m_1 m_2 + n_1 n_2, \qquad (4.38)$$

where $\{l_1, m_1, n_1\}$ are the direction cosines of a and $\{l_2, m_2, n_2\}$ are the
direction cosines of b. If θ is the angle of inclination between a and b then, by
virtue of Eqn (4.36), expression (4.38) becomes

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos\theta. \qquad (4.39)$$

This may be taken as an alternative definition of the scalar product $\mathbf{a}\cdot\mathbf{b}$.


ALTERNATIVE DEFINITION 4.13 The scalar product of the two vectors
a and b is written $\mathbf{a}\cdot\mathbf{b}$ and is defined to be the scalar quantity

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos\theta,$$

where θ is the angle between the vectors.


Notice that it is a direct consequence of the definition that the scalar 
product of two vectors is commutative. That is, we have a. b = b.a for any 
two vectors a and b. 

Because of this property we shall sometimes, and without confusion,
write $\mathbf{a}^2$ with the understanding that $\mathbf{a}^2 = \mathbf{a}\cdot\mathbf{a}$. In practice Definition 4.12
is most used to find the scalar product since it relates the scalar product
directly to the components of the vectors involved. The alternative form set
out in Definition 4.13 is used to find the angle between the two vectors once
the scalar product is known.


Example 4.23 Find the scalar product of the vectors $\mathbf{a} = -2\mathbf{i} - 3\mathbf{j} + \mathbf{k}$
and $\mathbf{b} = -\mathbf{i} + \mathbf{j} + 3\mathbf{k}$ and use the result to find the angle between a and b.


Solution From Definition 4.12 we have $\mathbf{a}\cdot\mathbf{b} = (-2)(-1) + (-3)(1) + (1)(3) = 2$.
Now $|\mathbf{a}| = \sqrt{14}$ and $|\mathbf{b}| = \sqrt{11}$, so that substituting in
Definition 4.13 we have $2 = \sqrt{14}\,\sqrt{11}\cos\theta$ and hence $\cos\theta = 2/\sqrt{154}$,
or $\theta = \arccos(2/\sqrt{154})$.


Consider the scalar products of the unit vectors i, j, and k. Since these are
mutually orthogonal the angle between any two is π/2. It follows from
Definition 4.13 that the scalar product of any two different unit vectors from
this triad is zero. As each of the vectors i, j, and k is parallel to itself, when
forming the scalar product of one of these vectors with itself we must set
θ = 0. Thus as $|\mathbf{i}| = |\mathbf{j}| = |\mathbf{k}| = 1$, it follows from Definition 4.13 that
$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1$. In summary we have these important results, which
should be memorized since they are fundamental to everything that follows:

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1,$$
$$\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = 0,$$
$$\mathbf{i}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0,$$
$$\mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = 0.$$


These results are conveniently combined in Table 4.2. Each entry is to be
interpreted as the scalar product of the vector at the left of the row of the
entry, with the vector at the top of the column of the entry.


Table 4.2 Table of scalar products of i, j, and k

First         Second member
member        i      j      k
i             1      0      0
j             0      1      0
k             0      0      1


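The scalar product of two vectors may be deduced using Table 4.2 by
simple algebraic manipulation without the use of Definition 4.12. To see this
consider the vectors $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$. First
form their scalar product

$$\mathbf{a}\cdot\mathbf{b} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k})\cdot(b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}),$$

and then expand the right-hand side as though ordinary algebraic quantities
were involved to obtain

$$\mathbf{a}\cdot\mathbf{b} = (a_1\mathbf{i})\cdot(b_1\mathbf{i}) + (a_1\mathbf{i})\cdot(b_2\mathbf{j}) + (a_1\mathbf{i})\cdot(b_3\mathbf{k}) + (a_2\mathbf{j})\cdot(b_1\mathbf{i}) + (a_2\mathbf{j})\cdot(b_2\mathbf{j}) + (a_2\mathbf{j})\cdot(b_3\mathbf{k}) + (a_3\mathbf{k})\cdot(b_1\mathbf{i}) + (a_3\mathbf{k})\cdot(b_2\mathbf{j}) + (a_3\mathbf{k})\cdot(b_3\mathbf{k}).$$

Next, recognizing that the scalars $a_i$, $b_i$ may be taken to the front of each
scalar product involved, rewrite the result thus:

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1\,\mathbf{i}\cdot\mathbf{i} + a_1 b_2\,\mathbf{i}\cdot\mathbf{j} + a_1 b_3\,\mathbf{i}\cdot\mathbf{k} + a_2 b_1\,\mathbf{j}\cdot\mathbf{i} + a_2 b_2\,\mathbf{j}\cdot\mathbf{j} + a_2 b_3\,\mathbf{j}\cdot\mathbf{k} + a_3 b_1\,\mathbf{k}\cdot\mathbf{i} + a_3 b_2\,\mathbf{k}\cdot\mathbf{j} + a_3 b_3\,\mathbf{k}\cdot\mathbf{k}.$$

Finally, using Table 4.2, this reduces to the desired result

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3.$$

In practice the intermediate working is always omitted and the result of a
scalar product is written on sight by retaining only the products involving
$\mathbf{i}\cdot\mathbf{i}$, $\mathbf{j}\cdot\mathbf{j}$, and $\mathbf{k}\cdot\mathbf{k}$.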
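Example 4.24 Determine the scalar products of these pairs of vectors:

(a) $\mathbf{a} = \mathbf{i} - 3\mathbf{j} + \mathbf{k}$, $\mathbf{b} = -\mathbf{i} + \mathbf{j} - 3\mathbf{k}$;

(b) $\mathbf{a} = 2\mathbf{i} + \mathbf{j} - \mathbf{k}$, $\mathbf{b} = -\mathbf{i} + \mathbf{j} - \mathbf{k}$;

(c) $\mathbf{a} = 2\mathbf{i} - \mathbf{j} + 3\mathbf{k}$, $\mathbf{b} = -2\mathbf{i} + \mathbf{j} - 3\mathbf{k}$;

(d) $\mathbf{a} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}$, $\mathbf{b} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}$.

Solutions To show the application of scalar products of unit vectors we shall
retain the notation $\mathbf{i}\cdot\mathbf{i}$, $\mathbf{j}\cdot\mathbf{j}$, and $\mathbf{k}\cdot\mathbf{k}$ in the first part of each calculation to
indicate the origin of the terms involved. The terms involving products such
as $\mathbf{i}\cdot\mathbf{j}$, $\mathbf{i}\cdot\mathbf{k}$, . . ., will be omitted as these scalar products are zero. The result
will usually be written down on sight without any intermediate working.

(a) $\mathbf{a}\cdot\mathbf{b} = (\mathbf{i} - 3\mathbf{j} + \mathbf{k})\cdot(-\mathbf{i} + \mathbf{j} - 3\mathbf{k})$
$= (1)(-1)\,\mathbf{i}\cdot\mathbf{i} + (-3)(1)\,\mathbf{j}\cdot\mathbf{j} + (1)(-3)\,\mathbf{k}\cdot\mathbf{k}$
$= -1 - 3 - 3 = -7$.

(b) $\mathbf{a}\cdot\mathbf{b} = (2\mathbf{i} + \mathbf{j} - \mathbf{k})\cdot(-\mathbf{i} + \mathbf{j} - \mathbf{k})$
$= (2)(-1)\,\mathbf{i}\cdot\mathbf{i} + (1)(1)\,\mathbf{j}\cdot\mathbf{j} + (-1)(-1)\,\mathbf{k}\cdot\mathbf{k}$
$= -2 + 1 + 1 = 0$.

Thus a and b are orthogonal.

(c) $\mathbf{a}\cdot\mathbf{b} = (2\mathbf{i} - \mathbf{j} + 3\mathbf{k})\cdot(-2\mathbf{i} + \mathbf{j} - 3\mathbf{k})$
$= (2)(-2)\,\mathbf{i}\cdot\mathbf{i} + (-1)(1)\,\mathbf{j}\cdot\mathbf{j} + (3)(-3)\,\mathbf{k}\cdot\mathbf{k}$
$= -4 - 1 - 9 = -14$.

(d) $\mathbf{a}\cdot\mathbf{b} = (\mathbf{i} + 2\mathbf{j} - \mathbf{k})\cdot(\mathbf{i} + 2\mathbf{j} - \mathbf{k})$
$= (1)(1)\,\mathbf{i}\cdot\mathbf{i} + (2)(2)\,\mathbf{j}\cdot\mathbf{j} + (-1)(-1)\,\mathbf{k}\cdot\mathbf{k}$
$= 1 + 4 + 1 = 6$.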
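The four scalar products of Example 4.24 can be confirmed with a few lines of code. This is only a sketch: the function name `dot` and the dictionary `pairs` are illustrative assumptions, and each vector is stored as a tuple of its components.

```python
def dot(a, b):
    """Scalar product a1*b1 + a2*b2 + a3*b3 (Definition 4.12)."""
    return sum(x * y for x, y in zip(a, b))

# The pairs (a, b) of Example 4.24, keyed by part label.
pairs = {
    "a": ((1, -3, 1), (-1, 1, -3)),
    "b": ((2, 1, -1), (-1, 1, -1)),
    "c": ((2, -1, 3), (-2, 1, -3)),
    "d": ((1, 2, -1), (1, 2, -1)),
}

results = {label: dot(a, b) for label, (a, b) in pairs.items()}
print(results)  # {'a': -7, 'b': 0, 'c': -14, 'd': 6}
```

A zero result, as in part (b), signals that the two vectors are orthogonal.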


Example (d) above is a special case of the scalar product of a vector with
itself and either from Definition 4.12 or 4.13 we see that for an arbitrary
vector a,

$$\mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|^2. \qquad (4.40)$$


In words, 'the scalar product of a vector with itself is equal to the square of
the modulus of that vector'. This simple result is often valuable when finding
a unit vector parallel to a given arbitrary vector a. To see how this comes
about, if we divide a by its modulus $|\mathbf{a}|$ to form the vector $\hat{\mathbf{a}} = \mathbf{a}/|\mathbf{a}|$, then
result (4.40) shows that $\hat{\mathbf{a}}\cdot\hat{\mathbf{a}} = 1$ and so $\hat{\mathbf{a}}$ is a unit vector.


Example 4.25 Find a unit vector $\hat{\mathbf{a}}$ parallel to the vector $\mathbf{a} = 3\mathbf{i} - \mathbf{j} - 2\mathbf{k}$.
Use the result to determine the projection of the vector $\mathbf{b} = 2\mathbf{i} + 3\mathbf{j} + \mathbf{k}$ in
the direction of a.


Solution Here $|\mathbf{a}| = \sqrt{14}$ so that the desired unit vector is
$\hat{\mathbf{a}} = \mathbf{a}/\sqrt{14} = (3/\sqrt{14})\mathbf{i} - (1/\sqrt{14})\mathbf{j} - (2/\sqrt{14})\mathbf{k}$. Now the projection of vector b along
a is by definition the length l of vector b when projected normally onto the
line determined by a. Thus it is $l = |\mathbf{b}| \cos\theta$, where θ is the angle between
b and a. Since $|\hat{\mathbf{a}}| = 1$ we may write this as $l = |\mathbf{b}||\hat{\mathbf{a}}| \cos\theta$ or, by Definition
4.13, as $l = \mathbf{b}\cdot\hat{\mathbf{a}}$. Hence in this problem $l = (2\mathbf{i} + 3\mathbf{j} + \mathbf{k})\cdot\hat{\mathbf{a}} = 1/\sqrt{14}$.
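The normalization and projection of Example 4.25 may be sketched as follows. The names `dot` and `unit` are assumptions chosen for this illustration, and the vector being normalized is assumed to be non-zero.

```python
import math

def dot(a, b):
    """Scalar product in component form."""
    return sum(x * y for x, y in zip(a, b))

def unit(a):
    """Unit vector a/|a|, using |a|^2 = a . a (assumes a is non-zero)."""
    m = math.sqrt(dot(a, a))
    return tuple(x / m for x in a)

a = (3, -1, -2)
b = (2, 3, 1)

a_hat = unit(a)
projection = dot(b, a_hat)   # l = b . a_hat

print(dot(a_hat, a_hat))     # approximately 1
print(projection)            # approximately 0.267, i.e. 1/sqrt(14)
```

The projection is a signed quantity: had the angle between b and a exceeded π/2, the same expression would have returned a negative number.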
It follows from the definition of a scalar product of two vectors and from
the properties of real numbers, that if a, b, and c are three arbitrary vectors,
then

$$\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c}.$$


This is the distributive law for the scalar product of vectors.

Expressions of the form $\mathbf{a}\cdot\mathbf{b}\cdot\mathbf{c}$, $\mathbf{a}\cdot\mathbf{b}\cdot\mathbf{c}\cdot\mathbf{d}$, . . ., are meaningless since
the scalar product is only defined between a pair of vectors. Note also that
division by vectors is not defined, since although we may write $\mathbf{a}\cdot\mathbf{b} = \eta$, it
makes no sense to write either $\mathbf{a} = \eta/\mathbf{b}$ or $\mathbf{b} = \eta/\mathbf{a}$.

The other form of product of two vectors is the vector product. We shall
denote the vector product of vectors a and b by $\mathbf{a} \times \mathbf{b}$. Again because of the
notation this is often colloquially called the cross product of two vectors.
Other notations in use for the vector product are [a, b] and $\mathbf{a} \wedge \mathbf{b}$. In preparation
for the definition of $\mathbf{a} \times \mathbf{b}$ we now introduce a unit vector $\hat{\mathbf{n}}$ that is
normal (i.e. orthogonal) to the plane defined by the vectors a and b, and
whose sense is such that a, b, and $\hat{\mathbf{n}}$, in this order, form a right-handed set of
vectors. Here, although a, b, and $\hat{\mathbf{n}}$ are not necessarily mutually orthogonal,
we use right-handedness exactly as was defined at the start of Section 4.6.


DEFINITION 4.14 The vector product of vectors a and b will be written
$\mathbf{a} \times \mathbf{b}$ and is defined to be the vector quantity

$$\mathbf{a} \times \mathbf{b} = |\mathbf{a}||\mathbf{b}| \sin\theta\,\hat{\mathbf{n}},$$

where θ is the angle between vectors a and b with $\sin\theta \ge 0$, and $\hat{\mathbf{n}}$ is a unit
vector normal to the plane of a and b such that a, b, and $\hat{\mathbf{n}}$, in this order, form
a right-handed set of vectors.


This shows that the vector $\mathbf{a} \times \mathbf{b}$ is normal to both a and b and has
magnitude $|\mathbf{a}||\mathbf{b}| \sin\theta$. The first interesting and unusual feature of this form
of product is that it is not commutative. If a, b, $\hat{\mathbf{n}}$, in this order, form a
right-handed set for the definition of $\mathbf{a} \times \mathbf{b}$, then for the definition of $\mathbf{b} \times \mathbf{a}$ it is
necessary to take for the right-handed set the vectors b, a, $-\hat{\mathbf{n}}$, in the stated
order. The immediate consequence is the important general result that if
a and b are arbitrary vectors, then

$$\mathbf{a} \times \mathbf{b} = -(\mathbf{b} \times \mathbf{a}). \qquad (4.41)$$


In contrast with the scalar product, it is easily seen that the vector product
of parallel vectors is identically zero, whereas the vector product of orthogonal
vectors is non-zero. A simple calculation gives Table 4.3 of vector products
of the unit vectors i, j, and k. The left-hand column identifies the first member
of the vector product and the top row identifies the second member of the
vector product. The corresponding entry in the table gives the result of the
vector product. The entries along the diagonal are all seen to be the zero or
null vector.


Table 4.3 Table of vector products of i, j, and k

First         Second member
member        i      j      k
i             0      k      −j
j             −k     0      i
k             j      −i     0


If we take, for example, the first element in the left-hand column and the
last element in the top row, we see that $\mathbf{i} \times \mathbf{k} = -\mathbf{j}$. In many respects it is
easier to memorize these three results:

$$\mathbf{i} \times \mathbf{j} = \mathbf{k}, \quad \mathbf{j} \times \mathbf{k} = \mathbf{i}, \quad \mathbf{k} \times \mathbf{i} = \mathbf{j}, \qquad (4.42)$$

and then to use property (4.41), than to remember Table 4.3 complete. The
order of the vectors occurring in these key relations can be remembered by
making the cyclic permutations

i  j  k
j  k  i
k  i  j


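As with scalar products, this table of vector products may be used to
calculate the vector product of any two vectors expressed in component form.
Consider the vector product $\mathbf{a} \times \mathbf{b}$ where $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and
$\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$. Proceeding as though ordinary algebraic quantities
were involved we write

$$\mathbf{a} \times \mathbf{b} = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}) \times (b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$$

$$= (a_1\mathbf{i}) \times (b_1\mathbf{i}) + (a_1\mathbf{i}) \times (b_2\mathbf{j}) + (a_1\mathbf{i}) \times (b_3\mathbf{k}) + (a_2\mathbf{j}) \times (b_1\mathbf{i}) + (a_2\mathbf{j}) \times (b_2\mathbf{j}) + (a_2\mathbf{j}) \times (b_3\mathbf{k}) + (a_3\mathbf{k}) \times (b_1\mathbf{i}) + (a_3\mathbf{k}) \times (b_2\mathbf{j}) + (a_3\mathbf{k}) \times (b_3\mathbf{k}),$$

working on the assumption that vector multiplication is distributive over
addition. Next we recognize that the scalars $a_i$, $b_i$ may be taken out in front
of each vector product that is involved so that the expression becomes

$$\mathbf{a} \times \mathbf{b} = a_1 b_1\,\mathbf{i} \times \mathbf{i} + a_1 b_2\,\mathbf{i} \times \mathbf{j} + a_1 b_3\,\mathbf{i} \times \mathbf{k} + a_2 b_1\,\mathbf{j} \times \mathbf{i} + a_2 b_2\,\mathbf{j} \times \mathbf{j} + a_2 b_3\,\mathbf{j} \times \mathbf{k} + a_3 b_1\,\mathbf{k} \times \mathbf{i} + a_3 b_2\,\mathbf{k} \times \mathbf{j} + a_3 b_3\,\mathbf{k} \times \mathbf{k}.$$

Finally, using Table 4.3 and collecting together the i, j, and k terms, we
obtain

$$\mathbf{a} \times \mathbf{b} = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}. \qquad (4.43)$$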


This is often taken as the definition of the vector product a x b in place 
of our Definition 4-14. Expression (4-43) may be considerably simplified if 
the concept of a determinant is used. Before showing this we must digress 
slightly to define this term. 


DEFINITION 4.15 Let a, b, c, and d be any four real numbers. Consider
the two-row by two-column array of these numbers

$$\begin{matrix} a & b \\ c & d. \end{matrix} \qquad (A)$$

Define the expression

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \qquad (B)$$

that is associated with this array by the identity

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = (ad - cb). \qquad (C)$$


We define the second-order determinant associated with the array (A) to be
the number represented in symbols by (B) and having the value defined by
(C). The process of expressing the left-hand side of (C) in the form of the
right-hand side is called expanding the determinant.
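Example 4.26 Evaluate the second-order determinants

$$\text{(a)}\ \begin{vmatrix} 1 & 7 \\ 3 & 9 \end{vmatrix}, \qquad \text{(b)}\ \begin{vmatrix} 0 & -1 \\ 4 & \delta \end{vmatrix}, \qquad \text{(c)}\ \begin{vmatrix} 2 & 6 \\ 1 & 3 \end{vmatrix}.$$


Solution The values of the determinants follow directly from the definition:

$$\text{(a)}\ \begin{vmatrix} 1 & 7 \\ 3 & 9 \end{vmatrix} = (1)(9) - (3)(7) = 9 - 21 = -12;$$

$$\text{(b)}\ \begin{vmatrix} 0 & -1 \\ 4 & \delta \end{vmatrix} = (0)(\delta) - (4)(-1) = 0 + 4 = 4;$$

$$\text{(c)}\ \begin{vmatrix} 2 & 6 \\ 1 & 3 \end{vmatrix} = (2)(3) - (1)(6) = 6 - 6 = 0.$$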
DEFINITION 4.16 Let $a_i$, $b_i$, and $c_i$ with i = 1, 2, 3 be any set of nine
real numbers. Consider the three-row by three-column array of these numbers

$$\begin{matrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3. \end{matrix} \qquad (A)$$

Define the expression

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \qquad (B)$$

that is associated with this array to be the single number that is determined by
the identity

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}. \qquad (C)$$


We define the third-order determinant associated with the array (A) to be the
number represented in symbols by (B) and having the value defined by (C).


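Example 4.27 Evaluate the third-order determinant

$$\Delta = \begin{vmatrix} 3 & 2 & 7 \\ 2 & 1 & 2 \\ 2 & 1 & 1 \end{vmatrix}.$$


Solution From the definition,

$$\begin{vmatrix} 3 & 2 & 7 \\ 2 & 1 & 2 \\ 2 & 1 & 1 \end{vmatrix} = (3)\begin{vmatrix} 1 & 2 \\ 1 & 1 \end{vmatrix} - (2)\begin{vmatrix} 2 & 2 \\ 2 & 1 \end{vmatrix} + (7)\begin{vmatrix} 2 & 1 \\ 2 & 1 \end{vmatrix}.$$


Expanding the three second-order determinants and adding, we obtain the
desired result

$$\Delta = 3(1 - 2) - 2(2 - 4) + 7(2 - 2) = 1.$$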
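Definitions 4.15 and 4.16 describe a small recursive procedure, which can be sketched in code. The function names `det2` and `det3` are assumptions, and the third-order array used below takes the rows of Example 4.27 to be (3, 2, 7), (2, 1, 2), and (2, 1, 1), a reading of the printed array that should itself be treated as an assumption.

```python
def det2(m):
    """Second-order determinant |a b; c d| = ad - cb (Definition 4.15)."""
    (a, b), (c, d) = m
    return a * d - c * b

def det3(m):
    """Third-order determinant expanded along its first row (Definition 4.16)."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(((b2, b3), (c2, c3)))
            - a2 * det2(((b1, b3), (c1, c3)))
            + a3 * det2(((b1, b2), (c1, c2))))

print(det2(((1, 7), (3, 9))))                     # -12
print(det2(((2, 6), (1, 3))))                     # 0
print(det3(((3, 2, 7), (2, 1, 2), (2, 1, 1))))    # 1
```

Note the alternating signs in `det3`: the middle term is subtracted, exactly as in identity (C) of Definition 4.16.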


It is helpful to classify determinants in some simple way, which the
next definition achieves.


DEFINITION 4.17 We define the order of a determinant to be the number
of terms that lie on a diagonal drawn from the top left-hand corner to the
bottom right-hand corner. The values of these terms are immaterial.


Thus in Example 4.26 the determinants are second-order, whereas in
Example 4.27 the determinant is third-order, and is evaluated in terms of
three second-order determinants.

We are now able to give the promised alternative definition of a vector 
product. 


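ALTERNATIVE DEFINITION 4.18 We define the vector product $\mathbf{a} \times \mathbf{b}$ of
the two vectors $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$ to be the
formal expansion of the determinant

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$


In this definition we have used the word 'formal' because, although the $a_i$
and $b_i$ are real numbers, the i, j, and k are unit vectors. Aside from this the
expansion of the third-order determinant is performed exactly as in Example
4.27.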


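Example 4.28 Determine the vector product $\mathbf{a} \times \mathbf{b}$ where $\mathbf{a} = \mathbf{i} + \mathbf{j} - 2\mathbf{k}$
and $\mathbf{b} = -2\mathbf{i} + 3\mathbf{j} + \mathbf{k}$.


Solution To apply Definition 4.18 we first notice that the components
$a_1$, $a_2$, and $a_3$ of a are 1, 1, and −2 whilst the components $b_1$, $b_2$, and $b_3$ of b
are −2, 3, and 1. Hence

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 1 & -2 \\ -2 & 3 & 1 \end{vmatrix} = \mathbf{i}\begin{vmatrix} 1 & -2 \\ 3 & 1 \end{vmatrix} - \mathbf{j}\begin{vmatrix} 1 & -2 \\ -2 & 1 \end{vmatrix} + \mathbf{k}\begin{vmatrix} 1 & 1 \\ -2 & 3 \end{vmatrix}$$

and so

$$\mathbf{a} \times \mathbf{b} = 7\mathbf{i} + 3\mathbf{j} + 5\mathbf{k}.$$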
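The component form (4.43) of the vector product is easily coded, and doing so gives a quick check on Example 4.28. In this sketch the names `cross` and `dot` are illustrative assumptions and vectors are tuples of components.

```python
def cross(a, b):
    """Vector product in component form, Eqn (4.43)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def dot(a, b):
    """Scalar product in component form."""
    return sum(x * y for x, y in zip(a, b))

a = (1, 1, -2)    # a = i + j - 2k
b = (-2, 3, 1)    # b = -2i + 3j + k

c = cross(a, b)
print(c)                      # (7, 3, 5)
print(cross(b, a))            # (-7, -3, -5), illustrating a x b = -(b x a)
print(dot(a, c), dot(b, c))   # 0 0, so a x b is normal to both a and b
```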


This effectively demonstrates that for most practical purposes Definition 
4-18 involves the least manipulation. 

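It is easily proved that the vector product is distributive, so that for any
three vectors a, b, and c we always have

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}.$$


Indeed this is implied by the way in which Eqn (4.43) was derived.
With the introduction of the vector product, mixed products of the form
$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c})$ become possible. This type of product is known as a triple scalar
product and as it involves the scalar product of a with $(\mathbf{b} \times \mathbf{c})$ it is seen to be
a scalar. If $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$, $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$, and $\mathbf{c} = c_1\mathbf{i} + c_2\mathbf{j} + c_3\mathbf{k}$
then by combination of Definitions 4.12 and 4.18 we have

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = (a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k})\cdot\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$

or,

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = a_1(b_2 c_3 - c_2 b_3) - a_2(b_1 c_3 - c_1 b_3) + a_3(b_1 c_2 - c_1 b_2).$$


The terms on the right-hand side of this expression are the result of expanding
(C) in Definition 4.16, so that they may be re-combined into a determinant to
give the general result

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \qquad (4.44)$$


By interchanging rows of the determinant it is readily shown that the
dot · and the cross × in a triple scalar product may be interchanged so that

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \times \mathbf{b})\cdot\mathbf{c}. \qquad (4.45)$$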


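Example 4.29 Evaluate the triple scalar product $\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c})$ given that
$\mathbf{a} = 2\mathbf{i} + \mathbf{k}$, $\mathbf{b} = \mathbf{i} + \mathbf{j} + 2\mathbf{k}$, and $\mathbf{c} = -\mathbf{i} + \mathbf{j}$.


Solution The components of a, b, and c are, respectively, (2, 0, 1), (1, 1, 2),
and (−1, 1, 0). Hence

$$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} 2 & 0 & 1 \\ 1 & 1 & 2 \\ -1 & 1 & 0 \end{vmatrix} = 2(0 - 2) - 0(0 + 2) + 1(1 + 1) = -2.$$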

As our next generalization, we notice that vector products of more than
two vectors are defined provided the order in which these products are to be
carried out is specified by bracketing. As a special case we have the triple
vector product $\mathbf{a} \times (\mathbf{b} \times \mathbf{c})$ of the three vectors a, b, and c which differs
from the triple vector product $(\mathbf{a} \times \mathbf{b}) \times \mathbf{c}$. The first expression signifies the
vector product of a and $(\mathbf{b} \times \mathbf{c})$, whilst the second signifies the vector product
of $(\mathbf{a} \times \mathbf{b})$ and c, and in general these are different vectors.

A straightforward application of Definition 4.18 establishes the following
useful identity from which some interesting results may be derived

$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\mathbf{c}. \qquad (4.46)$$

The details of the proof are left to the reader.


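Example 4.30 Demonstrate the difference between the triple vector products
$\mathbf{a} \times (\mathbf{b} \times \mathbf{c})$ and $(\mathbf{a} \times \mathbf{b}) \times \mathbf{c}$ by making the identifications $\mathbf{a} = \mathbf{i}$, $\mathbf{b} = \mathbf{i} + \mathbf{j}$,
$\mathbf{c} = \mathbf{k}$.


Solution By direct substitution we find that $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{i} \times [(\mathbf{i} + \mathbf{j}) \times \mathbf{k}]$
and so expanding this result by using Eqn (4.42) gives
$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{i} \times [-\mathbf{j} + \mathbf{i}] = -\mathbf{k}$. Similarly, in the second case,
$(\mathbf{a} \times \mathbf{b}) \times \mathbf{c} = [\mathbf{i} \times (\mathbf{i} + \mathbf{j})] \times \mathbf{k} = \mathbf{k} \times \mathbf{k} = \mathbf{0}$.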
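The two bracketings of Example 4.30, and identity (4.46), can be compared numerically. The sketch below is illustrative; the names `dot`, `cross`, and `scale` are assumptions.

```python
def dot(a, b):
    """Scalar product in component form."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product in component form, Eqn (4.43)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def scale(s, v):
    """Scalar multiple s * v."""
    return tuple(s * x for x in v)

a = (1, 0, 0)   # a = i
b = (1, 1, 0)   # b = i + j
c = (0, 0, 1)   # c = k

left = cross(a, cross(b, c))    # a x (b x c)
right = cross(cross(a, b), c)   # (a x b) x c
print(left)    # (0, 0, -1), i.e. -k
print(right)   # (0, 0, 0), the null vector

# Right-hand side of Eqn (4.46): (a . c) b - (a . b) c
identity = tuple(p - q for p, q in
                 zip(scale(dot(a, c), b), scale(dot(a, b), c)))
print(identity == left)   # True
```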


4.8 Geometrical applications 


This section illustrates something of the application of vectors to elementary 
geometry, and gives some simple but useful results. First we consider the 
representation of a straight line in vector form, and then show how the single 
vector equation may be reduced to the more familiar set of three Cartesian 
equations. 


The straight line 


Consider the problem of determining the equation of a straight line given that 
it passes through the point A with position vector a relative to O, and is 
parallel to vector b. We shall denote the position vector of a general point P 
on the line by r as shown in Fig. 4-13. 


Fig. 4.13 Straight line through A parallel to b.


By the rules of vector addition we have

$$\overrightarrow{OP} = \overrightarrow{OA} + \overrightarrow{AP}$$

or,

$$\mathbf{r} = \mathbf{a} + \overrightarrow{AP}.$$


However, as the straight line through A is parallel to the free vector b,
it follows that for any point P on the line there is a scalar λ such that we can
write $\overrightarrow{AP} = \lambda\mathbf{b}$. Applying this result to the equation above we see that the
vector equation for the straight line becomes

$$\mathbf{r} = \mathbf{a} + \lambda\mathbf{b}. \qquad (4.47)$$

The scalar λ in this equation is simply a parameter, and different values of
λ will determine different points on the line. To express this result in Cartesian
form, set $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$,
when Eqn (4.47) reduces to

$$x\mathbf{i} + y\mathbf{j} + z\mathbf{k} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} + \lambda(b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}).$$

This vector equation implies three scalar equations by virtue of the equality
of its i, j, and k components. Hence we arrive at the three scalar equations

$$x = a_1 + \lambda b_1 \quad \text{(i-component)}$$

$$y = a_2 + \lambda b_2 \quad \text{(j-component)}$$

$$z = a_3 + \lambda b_3 \quad \text{(k-component)}.$$


If these are each solved for λ and equated, we obtain the more familiar result

$$\frac{x - a_1}{b_1} = \frac{y - a_2}{b_2} = \frac{z - a_3}{b_3}. \qquad (4.48)$$

Equations (4.48) are the standard Cartesian form for the equations of a
straight line. Notice that the coefficients of x, y, and z in Eqn (4.48) are all
unity; that $b_1$, $b_2$, and $b_3$ are then the direction ratios of b and $a_1$, $a_2$, and $a_3$
define a point on the line. Equations (4.48) are sometimes expressed in the
form of three simultaneous equations relating x and y, x and z, and y and z.
This follows by cross-multiplying different pairs of expressions in Eqn (4.48).


Example 4.31 Find the vector equation of the line through the point with
position vector $\mathbf{i} + 3\mathbf{j} - \mathbf{k}$ which is parallel to the vector $2\mathbf{i} + 3\mathbf{j} + 4\mathbf{k}$.
Determine the point on the line corresponding to λ = 2 in the resulting
equation. Also express the vector equation of the line in standard Cartesian
form.


Solution From Eqn (4.47) we have

$$\mathbf{r} = (\mathbf{i} + 3\mathbf{j} - \mathbf{k}) + \lambda(2\mathbf{i} + 3\mathbf{j} + 4\mathbf{k})$$

or,

$$\mathbf{r} = (1 + 2\lambda)\mathbf{i} + 3(1 + \lambda)\mathbf{j} + (4\lambda - 1)\mathbf{k}.$$


This is the vector equation of the line, and setting λ = 2 determines the
point $\mathbf{r} = 5\mathbf{i} + 9\mathbf{j} + 7\mathbf{k}$. To express the equation of the line in Cartesian
form we appeal to Eqns (4.48) and use the fact that $\mathbf{a} = \mathbf{i} + 3\mathbf{j} - \mathbf{k}$ and
$\mathbf{b} = 2\mathbf{i} + 3\mathbf{j} + 4\mathbf{k}$. Hence $a_1 = 1$, $a_2 = 3$, $a_3 = -1$, and $b_1 = 2$, $b_2 = 3$,
and $b_3 = 4$, so that the desired Cartesian equations are

$$\frac{x - 1}{2} = \frac{y - 3}{3} = \frac{z + 1}{4}.$$


As a check we can also use these equations to determine the point
corresponding to λ = 2. We must solve the three equations

$$\frac{x - 1}{2} = \frac{y - 3}{3} = \frac{z + 1}{4} = 2,$$

which give x = 5, y = 9, and z = 7. These are of course the coordinates of
the tip of the position vector $\mathbf{r} = 5\mathbf{i} + 9\mathbf{j} + 7\mathbf{k}$ which confirms our previous
result.
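The parametric form (4.47) and the Cartesian form (4.48) of Example 4.31 can be compared in a few lines. In this sketch the function name `point_on_line` is an assumption, and exact rational arithmetic from the standard `fractions` module is used so that the three Cartesian ratios can be compared without rounding.

```python
from fractions import Fraction

def point_on_line(a, b, lam):
    """Point r = a + lam * b on the line through a parallel to b, Eqn (4.47)."""
    return tuple(ai + lam * bi for ai, bi in zip(a, b))

a = (1, 3, -1)   # a = i + 3j - k, a point on the line
b = (2, 3, 4)    # b = 2i + 3j + 4k, the direction of the line

p = point_on_line(a, b, 2)
print(p)   # (5, 9, 7)

# Each ratio (x - a1)/b1, (y - a2)/b2, (z - a3)/b3 of Eqn (4.48)
# should recover the same parameter value.
ratios = [Fraction(x - ai, bi) for x, ai, bi in zip(p, a, b)]
print(ratios[0] == ratios[1] == ratios[2] == 2)   # True
```

Forming the ratios in this way presupposes that none of the direction ratios of b is zero.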


The same approach may be used if the line is required to pass through the
two points A and B with position vectors α and β, respectively. For then the
line passes through α and is parallel to the vector β − α which is just a
segment of the line itself. Hence we identify a with α and b with β − α, after
which the argument proceeds as before.

In the next example we illustrate how the non-standard Cartesian equations
of a straight line may be re-interpreted in vector form.


Example 4.32 The equations

$$\frac{2x - 1}{3} = \frac{y + 2}{3} = \frac{-z + 4}{2}$$

determine a straight line. Express them in vector form and find the direction
ratios of the line.


Solution To express the equations in standard Cartesian form we must
first make the coefficients of x, y, and z each equal to unity. Hence we rewrite
the equations:

$$\frac{x - \frac{1}{2}}{(3/2)} = \frac{y + 2}{3} = \frac{z - 4}{-2}.$$


The vector a then has components $a_1 = \frac{1}{2}$, $a_2 = -2$, $a_3 = 4$ and the vector
b has the components $b_1 = 3/2$, $b_2 = 3$, $b_3 = -2$. These latter three numbers
are the desired direction ratios. The vector equation of the straight line itself is

$$\mathbf{r} = \tfrac{1}{2}(1 + 3\lambda)\mathbf{i} + (3\lambda - 2)\mathbf{j} + 2(2 - \lambda)\mathbf{k}.$$

(Why?)


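On occasion it is necessary to determine the perpendicular distance p
from a point C with position vector c to the line L with equation $\mathbf{r} = \mathbf{a} + \lambda\mathbf{b}$.
This can be done by applying Pythagoras' theorem in Fig. 4.14.

Fig. 4.14 Perpendicular distance of point from line.


We have the obvious result

$$p^2 = (AC)^2 - (AB)^2$$

but $\overrightarrow{AC} = \mathbf{c} - \mathbf{a}$ so that $(AC)^2 = |AC|^2 = (\mathbf{c} - \mathbf{a})\cdot(\mathbf{c} - \mathbf{a})$, whilst length
AB is the projection of AC onto the line L. Now the unit vector along L is
$\mathbf{b}/|\mathbf{b}|$ so that $AB = (\mathbf{c} - \mathbf{a})\cdot\mathbf{b}/|\mathbf{b}|$ and thus

$$(AB)^2 = \left(\frac{(\mathbf{c} - \mathbf{a})\cdot\mathbf{b}}{|\mathbf{b}|}\right)^2.$$


Combining these results gives

$$p^2 = (\mathbf{c} - \mathbf{a})\cdot(\mathbf{c} - \mathbf{a}) - \left(\frac{(\mathbf{c} - \mathbf{a})\cdot\mathbf{b}}{|\mathbf{b}|}\right)^2, \qquad (4.49)$$

from which p may be deduced.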


Example 4.33 Find the distance of the point with position vector i + j + k 
from the line r = (i + 2] + k) + λᾷ -- 2] +k). 


Solution In the notation leading to Eqn (4-49) we have a=i+ 2] +k, 
b =i — 2] +k, and c=i+j+k. Hence c — a = -j and thus (ς — a) 
. (ς — a) = (—j).(-j) =1. Also (ς —a).b = --ἰ. ( -- 2] +k) =2 so 
that ((ς — a). b)? = 4, whilst |b|? = 6. Hence 


$$\left(\frac{(\mathbf{c}-\mathbf{a})\cdot\mathbf{b}}{|\mathbf{b}|}\right)^2 = \frac{4}{6} = \frac{2}{3},$$

and so from Eqn (4.49), p² = 1 − 2/3 = 1/3, or p = 1/√3, as p is essentially
positive.

Fig. 4.15 Vector equation of a plane n · r = |n| p.
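
The calculation in Example 4.33 can be checked numerically. The following is only a minimal sketch of Eqn (4.49) in Python; the helper names `dot` and `point_line_distance` are illustrative choices and do not come from the text.

```python
import math

def dot(u, v):
    """Scalar product of two vectors given as equal-length tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def point_line_distance(c, a, b):
    """Distance p from the point c to the line r = a + lambda*b, using Eqn (4.49)."""
    ca = tuple(ci - ai for ci, ai in zip(c, a))      # the vector c - a
    ab_length = dot(ca, b) / math.sqrt(dot(b, b))    # projection of c - a onto the line
    p_squared = dot(ca, ca) - ab_length ** 2
    return math.sqrt(max(p_squared, 0.0))            # clamp tiny negative round-off

# Data of Example 4.33: c = i + j + k, a = i + 2j + k, b = i - 2j + k.
p = point_line_distance((1, 1, 1), (1, 2, 1), (1, -2, 1))
print(p)  # to be compared with 1/sqrt(3)
```

The vector b need not be a unit vector here, because the formula divides by |b| explicitly.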


The plane 


The equation of a plane is easily determined once it is recognized that a
plane Π is specified when one point on it is known, together with any vector
perpendicular to it. Such a vector, when normalized, is a unit normal to the
plane Π and is unique except for its sign. The ambiguity as to the sign of the
normal arises, of course, because a plane has no preferred side. To derive its
equation consider Fig. 4.15.

Let r be the position vector relative to O of a point P on the plane Π, and
n be a vector normal to the plane directed through the plane away from O
so that the corresponding unit normal is n̂ = n/|n|. Further, let the
perpendicular distance ON from the origin O to the plane be p. Then for all points P
we have (OP) cos θ = p, where θ is the angle between OP and ON. In terms of
vectors this is

$$\hat{\mathbf{n}}\cdot\mathbf{r} = p, \tag{4.50}$$


which is just the vector equation of a plane. If the number p in Eqn (4.50) is
positive then the plane lies on the side of the origin towards which n is directed,
otherwise it lies on the opposite side.

To express result (4.50) in Cartesian form let r = xi + yj + zk and the
unit normal n̂ = n/|n| = li + mj + nk, where of course l² + m² + n² = 1.
Equation (4.50) becomes

$$lx + my + nz = p. \tag{4.51}$$

This is the standard Cartesian form of the equation of a plane. Any equation
of this form represents a plane having for its unit normal the vector li + mj
+ nk and lying at a perpendicular distance p from the origin. If p = 0 the
plane passes through the origin.


Example 4.34 Find the Cartesian equation of the plane containing the point 
(1, 2, 3) which is normal to the vector i + 2j + 2k. 


Solution First we use Eqn (4.50) to determine p. Since the point (1, 2, 3)
lies in the plane, r = i + 2j + 3k is the position vector of a point in the plane.
The vector normal to the plane in this case is n = i + 2j + 2k, so that
|n| = 3 and the unit normal n̂ = n/|n| = (i + 2j + 2k)/3. This shows that
l = 1/3, m = 2/3, n = 2/3. Hence, substituting into Eqn (4.50),

$$p = \tfrac{1}{3}(\mathbf{i} + 2\mathbf{j} + 2\mathbf{k})\cdot(\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}) = \tfrac{1}{3}(1 + 4 + 6),$$

or p = 11/3. As p > 0, the plane must lie on the side of the origin towards
which n̂ is directed. Substituting in Eqn (4.51) we find the desired Cartesian
form of the equation of the plane:

$$\tfrac{1}{3}x + \tfrac{2}{3}y + \tfrac{2}{3}z = \tfrac{11}{3}.$$


This equation could equally well be written in the non-standard Cartesian 
form x + 2y + 2z = 11, though then the constant on the right-hand side is 
no longer the perpendicular distance of the plane from the origin. 
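
As a hedged illustration of Example 4.34, the short Python sketch below normalizes a given normal vector and evaluates p from Eqn (4.50). The function name `plane_from_point_and_normal` is an assumption made for this sketch only.

```python
import math

def plane_from_point_and_normal(point, normal):
    """Return (l, m, n, p) with l*x + m*y + n*z = p and (l, m, n) a unit normal."""
    length = math.sqrt(sum(c * c for c in normal))
    unit = tuple(c / length for c in normal)           # unit normal n/|n|
    p = sum(u * x for u, x in zip(unit, point))        # p = unit normal . r, Eqn (4.50)
    return (*unit, p)

# Data of Example 4.34: point (1, 2, 3), normal i + 2j + 2k.
l, m, n, p = plane_from_point_and_normal((1, 2, 3), (1, 2, 2))
print(l, m, n, p)
```

A negative value of p would simply indicate that the chosen normal points away from the side of the origin on which the plane lies.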


Simple geometrical considerations similar to those set out above, when
coupled with the scalar and vector product, enable various useful results to
be derived very quickly. For example, as the angle θ between two planes is
defined to be the angle between their unit normals n̂1 and n̂2 it follows that
θ may be obtained from the scalar product n̂1 · n̂2 = cos θ. Also the line of
intersection of these two planes is perpendicular to both normals n̂1 and n̂2
and so is parallel to the vector t determined by the vector product t = n̂1 × n̂2.
Rather than elaborate on these ideas here, a number of problems are given
at the end of the chapter.
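
The two remarks above translate directly into a few lines of Python. This is a sketch under stated assumptions: the two normals used, k and j + k, are hypothetical data chosen for illustration and do not appear in the text.

```python
import math

def dot(u, v):
    """Scalar product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Vector product u x v for 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(v):
    """Normalize v to unit length."""
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

n1 = unit((0, 0, 1))   # hypothetical normal of the first plane
n2 = unit((0, 1, 1))   # hypothetical normal of the second plane

cos_theta = max(-1.0, min(1.0, dot(n1, n2)))  # clamp against round-off
theta = math.acos(cos_theta)                  # angle between the planes
t = cross(n1, n2)                             # direction of the line of intersection
print(theta, t)
```

The vector t is perpendicular to both normals, which can be confirmed by checking that its scalar products with n1 and n2 vanish.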


The sphere 
Consider a sphere of radius R with its centre at the point A with the position 
vector a. Then if r is the position vector of any point on the surface of the 
sphere, the modulus of the vector r — a must equal R. In terms of vectors the 
equation of the sphere is 

$$|\mathbf{r} - \mathbf{a}| = R$$

or, alternatively,

$$(\mathbf{r} - \mathbf{a})\cdot(\mathbf{r} - \mathbf{a}) = R^2. \tag{4.52}$$

If, now, we expand this equation to get

$$\mathbf{r}\cdot\mathbf{r} - 2\,\mathbf{r}\cdot\mathbf{a} = R^2 - \mathbf{a}\cdot\mathbf{a}$$

and then set r = xi + yj + zk, a = a1 i + a2 j + a3 k and R² − a·a = q,
we obtain the standard Cartesian form of the equation of a sphere

$$x^2 + y^2 + z^2 - 2a_1 x - 2a_2 y - 2a_3 z = q. \tag{4.53}$$


Example 4.35 Find the Cartesian form of the equation of the sphere of radius 2
having its centre at a = i + j + 2k.


Solution As r = xi + yj + zk and a = i + j + 2k we have
r − a = (x − 1)i + (y − 1)j + (z − 2)k, whilst R = 2. Hence Eqn (4.52) becomes

$$(x - 1)^2 + (y - 1)^2 + (z - 2)^2 = 4,$$


which is the desired Cartesian form of the equation. 


4.9 Applications to mechanics 


This section briefly introduces some of the many situations in mechanics 
that are best described vectorially. First is one of the simplest applications 
of vectors, that will already be familiar to the reader. 


Polygon of forces—resultant 


It is known from experiment that when forces F1, F2, . . ., Fn act on a rigid
body through a single point O, their combined effect is equivalent to that of a
single force R, their resultant, which acts through the same point O and is
equal to their vector sum. Such a system of forces acting through a single
point is a concurrent system of forces. Thus we have

$$\mathbf{R} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_n. \tag{4.54}$$

These forces are often represented in the form of a vector polygon of
forces as shown in Fig. 4.16, in which the senses of the forces Fi are all
similarly directed and are opposite to the sense of R.

Conversely, the vector polygon shows that the vector −R is the additional
force that is required to act through O in order to maintain the system of
forces in equilibrium.


Example 4.36 Forces F1, F2, and F3 have magnitudes 3√3, √14, and 2√6 lb
and act concurrently through a point O along the lines of the vectors
i + j + k, 3i − j + 2k, and −i + 2j + k, respectively. Find the force Q that
must act through O for the system to remain in equilibrium.

Fig. 4.16 Vector polygon.


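Solution This is a direct application of the last remark about the vector
polygon of forces, and the only problem is one of scaling. Let us agree that a
vector of unit modulus represents a force of 1 lb. From the conditions of the
question we see that F1, F2, and F3 are respectively directed along the unit
vectors

$$\hat{\mathbf{f}}_1 = \tfrac{1}{\sqrt{3}}(\mathbf{i} + \mathbf{j} + \mathbf{k}), \quad
\hat{\mathbf{f}}_2 = \tfrac{1}{\sqrt{14}}(3\mathbf{i} - \mathbf{j} + 2\mathbf{k}), \quad
\hat{\mathbf{f}}_3 = \tfrac{1}{\sqrt{6}}(-\mathbf{i} + 2\mathbf{j} + \mathbf{k}).$$

Using the scale factor we can use these to write

$$\mathbf{F}_1 = 3\sqrt{3}\,\hat{\mathbf{f}}_1 = 3\mathbf{i} + 3\mathbf{j} + 3\mathbf{k},$$
$$\mathbf{F}_2 = \sqrt{14}\,\hat{\mathbf{f}}_2 = 3\mathbf{i} - \mathbf{j} + 2\mathbf{k},$$
$$\mathbf{F}_3 = 2\sqrt{6}\,\hat{\mathbf{f}}_3 = -2\mathbf{i} + 4\mathbf{j} + 2\mathbf{k}.$$

Hence the resultant R = F1 + F2 + F3 = 4i + 6j + 7k. The force necessary
for equilibrium is Q = −R, showing that Q = −4i − 6j − 7k.

As |Q| = √101, it follows immediately that the desired force is √101 lb
and acts in the direction of the unit vector Q̂, where

$$\hat{\mathbf{Q}} = -\tfrac{1}{\sqrt{101}}(4\mathbf{i} + 6\mathbf{j} + 7\mathbf{k}).$$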
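
The scaling step in Example 4.36 is easy to mechanize. The following Python sketch, in which the helper `scale_to_magnitude` is a hypothetical name introduced for illustration, builds each force from its direction and magnitude and then sums the components as in Eqn (4.54).

```python
import math

def scale_to_magnitude(direction, magnitude):
    """Return the vector of the given magnitude along the given direction."""
    length = math.sqrt(sum(c * c for c in direction))
    return tuple(magnitude * c / length for c in direction)

# Data of Example 4.36 (magnitudes in lb).
forces = [
    scale_to_magnitude((1, 1, 1), 3 * math.sqrt(3)),
    scale_to_magnitude((3, -1, 2), math.sqrt(14)),
    scale_to_magnitude((-1, 2, 1), 2 * math.sqrt(6)),
]

resultant = tuple(sum(components) for components in zip(*forces))  # R = F1 + F2 + F3
equilibrant = tuple(-c for c in resultant)                         # Q = -R
magnitude = math.sqrt(sum(c * c for c in equilibrant))             # |Q|
print(resultant, equilibrant, magnitude)
```

Up to floating-point round-off the resultant should agree with 4i + 6j + 7k found above.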


In many problems of statics the centroid or the centre of mass of a system
of particles is of importance. We now define this concept in terms of vectors.

DEFINITION 4.19 The centre of mass of the system of masses m1, m2, . . .,
mn whose position vectors are a1, a2, . . ., an is at the point G, where G
has the position vector g determined by

$$\mathbf{g} = \frac{m_1\mathbf{a}_1 + m_2\mathbf{a}_2 + \cdots + m_n\mathbf{a}_n}{m_1 + m_2 + \cdots + m_n}.$$
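
Definition 4.19 is a mass-weighted average of position vectors, which the following Python sketch computes component by component. The two-particle data are hypothetical values chosen only to make the result easy to check by hand.

```python
def centre_of_mass(masses, positions):
    """Position vector g of the centre of mass, following Definition 4.19."""
    total = sum(masses)
    return tuple(
        sum(m * a[k] for m, a in zip(masses, positions)) / total
        for k in range(3)
    )

# Hypothetical data: mass 1 at the origin and mass 3 at the point 4i.
g = centre_of_mass([1, 3], [(0, 0, 0), (4, 0, 0)])
print(g)
```

The centre of mass lies three quarters of the way along the segment towards the heavier particle, as the weighting suggests.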


Next we discuss simple problems about relative motions, and relative 
velocity. 


Relative velocity 

Problems involving the motion of one point relative to another, which is 
itself moving, occur frequently in mechanics and easily lend themselves to 
vector treatment. They are best illustrated by example but first we define 
relative velocity. 


DEFINITION 4.20 The relative velocity of a point P with velocity u, relative
to the point Q with velocity v, is defined to be the velocity u − v.


Example 4.37 A man walks due east at 4 mile/h and his dog runs north-east
at 12 mile/h. Find the velocity and speed of the man relative to his dog.


Solution Let a unit vector denote a velocity of magnitude 1 mile/h and take
j pointing due north and i pointing due east.

Unit vectors in the directions of motion of the man and dog are then i and
(i + j)/√2. The velocity u of the man is thus u = 4i and the velocity v of the
dog is v = 6√2(i + j). Hence the velocity of the man relative to his dog is

$$\mathbf{u} - \mathbf{v} = 2(2 - 3\sqrt{2})\,\mathbf{i} - 6\sqrt{2}\,\mathbf{j}.$$

His relative speed is |u − v| = (160 − 48√2)^{1/2} mile/h.
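
A quick numerical check of Example 4.37 is sketched below in Python, with east and north taken as the first and second components. The variable names are illustrative only.

```python
import math

u = (4.0, 0.0)                                # man: 4 mile/h due east
v = (12 / math.sqrt(2), 12 / math.sqrt(2))    # dog: 12 mile/h towards the north-east

relative = (u[0] - v[0], u[1] - v[1])         # u - v, Definition 4.20
speed = math.hypot(*relative)                 # |u - v| in mile/h
print(relative, speed)
```

Squaring the computed speed should reproduce 160 − 48√2 to within round-off.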


Work done by a force 


The scalar product can be used to give a convenient representation of the
work W done by a force F that produces a displacement d of the particle on
which it acts. The work done by a force of magnitude |F| when it displaces a
particle through a distance |d| is defined as the product of the distance
moved and the component of force in the direction of the displacement.
Hence, as W is positive, we have

$$W = |\mathbf{F}|\,|\mathbf{d}|\,|\cos\theta|,$$

where θ is the angle of inclination between F and d. So the final result is:

$$W = |\mathbf{F}\cdot\mathbf{d}|. \tag{4.55}$$


Example 4.38 Calculate the work W done by a force F of 12 lb whose line
of action is parallel to 2i + 3j − 2k when it moves its point of application
through a displacement d of 4 ft in a direction parallel to −2i + j − 3k.


Solution The unit vectors parallel to the force F and displacement d are
f̂ = (2i + 3j − 2k)/√17 and d̂ = (−2i + j − 3k)/√14, respectively. Let f̂
denote a force of 1 lb and d̂ a displacement of 1 ft, so that F = 12f̂
= (24i + 36j − 24k)/√17 and d = 4d̂ = (−8i + 4j − 12k)/√14. Then
the work W that is done is

$$W = \frac{(24)(-8) + (36)(4) + (-24)(-12)}{\sqrt{17}\,\sqrt{14}} = \frac{240}{\sqrt{238}} \text{ ft lb}.$$
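
The arithmetic of Example 4.38 can be checked with a few lines of Python. This is a sketch only; `scale_to_magnitude` and `dot` are hypothetical helper names, and the absolute value follows the convention of Eqn (4.55).

```python
import math

def dot(u, v):
    """Scalar product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def scale_to_magnitude(direction, magnitude):
    """Return the vector of the given magnitude along the given direction."""
    length = math.sqrt(dot(direction, direction))
    return tuple(magnitude * c / length for c in direction)

F = scale_to_magnitude((2, 3, -2), 12)    # force of 12 lb
d = scale_to_magnitude((-2, 1, -3), 4)    # displacement of 4 ft
W = abs(dot(F, d))                        # work in ft lb, Eqn (4.55)
print(W)
```

Since |F||d| = 48, the work cannot exceed 48 ft lb, which is a useful sanity check on any hand calculation.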


We now turn to applications of the vector product. One of the easiest 
occurs in the determination of the angular velocity of a point rotating about a 
fixed axis. 


Angular velocity 


Consider a rigid body rotating with a constant spin Ω rad/s about a fixed
axis L. Fig. 4.17 represents a point P in such a body, having the position vector
d relative to a point O on the spin axis L. Point Q is the foot of the
perpendicular from P to the line L.

The vector $\boldsymbol{\Omega}$ parallel to L with magnitude Ω and sense determined by a
right-hand screw rule with respect to L and the direction of the spin is
called the angular velocity of the body. The instantaneous linear velocity v
of point P with position vector d has magnitude Ω(QP) and is directed tangent
to the dotted circle in Fig. 4.17. Since QP = |d| sin θ, where θ is the angle
between $\boldsymbol{\Omega}$ and d, we may rewrite this as

$$|\mathbf{v}| = \Omega\,|\mathbf{d}|\sin\theta$$
or as
$$\mathbf{v} = \boldsymbol{\Omega}\times\mathbf{d}. \tag{4.56}$$

The final two applications of the vector product involve the concept of the
moment of a vector, which is first defined, and they require the use of a bound
vector.


DEFINITION 4:21 We define M =d x Q to be the moment of vector Q 
about the point O, where d is the position vector relative to O of any point on 
the line of action of the bound vector Q. 


This definition is illustrated in Fig. 4.18 in which the plane Π contains the
vectors d and Q and, by virtue of the definition of the moment, M is normal
to Π.

The natural mechanical applications of this definition are to the moment
of a force and to the moment of momentum about a fixed point. In both
situations the line of action of the vector whose moment is to be found is
important, as is its point of application in some circumstances.

Fig. 4.17 Angular velocity. Fig. 4.18 Moment of a vector about O.

If Q is identified with a force F, then the expression


$$\mathbf{M} = \mathbf{d}\times\mathbf{F} \tag{4.57}$$

is the moment or torque of the force F about O. If the force is expressed in
lb and the displacement vector in ft, the units of torque are lb-ft. Similarly,
if Q is identified with the momentum mv of a particle of mass m moving with
velocity v, then the vector

$$\mathbf{M} = \mathbf{d}\times(m\mathbf{v}) = m\,\mathbf{d}\times\mathbf{v} \tag{4.58}$$

is the moment of momentum or the angular momentum of the particle about O.
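
The three vector-product results of this section, Eqns (4.56) to (4.58), share one computation. The Python sketch below uses hypothetical data (a spin of 2 rad/s about the z-axis, a point at i, a force 3j, and a particle of mass 2 with velocity j) purely for illustration.

```python
def cross(u, v):
    """Vector product u x v for 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

omega = (0, 0, 2)    # hypothetical angular velocity: 2 rad/s about the z-axis
d = (1, 0, 0)        # hypothetical position vector of P relative to O
F = (0, 3, 0)        # hypothetical force acting through P
m, vel = 2, (0, 1, 0)  # hypothetical mass and velocity of a particle at P

v = cross(omega, d)                                     # linear velocity, Eqn (4.56)
torque = cross(d, F)                                    # moment of the force, Eqn (4.57)
ang_momentum = tuple(m * c for c in cross(d, vel))      # moment of momentum, Eqn (4.58)
print(v, torque, ang_momentum)
```

Note the order of the factors: the velocity uses Ω × d whereas the moments use d × F and d × (mv), and reversing either order changes the sign of the result.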


PROBLEMS 
Section 4-1 


4:1 Give a graphical representation of each of the following velocities by drawing 
directed line elements. In each case indicate the sense of the vector with an 
arrow: 


(a) 4 ft/s in a north-east direction; 
(b) 2-5 ft/s in a south-west direction; 
(c) 5 ft/s due west.


What velocities would these same directed line elements represent if the 
arrows were reversed ? 


4.2 Classify each of these quantities as scalar or vector:
(a) volume; (b) length; (c) momentum = mass × velocity; (d) electric
field; (e) speed; (f) acceleration; (g) density; (h) chemical concentration;
(i) electrostatic capacity; (j) moment of a force.

4.3 Find the roots of each equation:
(a) x² = −36; (b) x² = −27; (c) x² = 25; (d) x² = −2.


4.4 Find the roots of these quadratic equations:
(a) x² + 3x + 3 = 0; (b) x² − 3x + 2 = 0; (c) x² + 4x + 5 = 0.

4.5 By setting x² = w, reduce the following quartic equations to quadratic
equations, and hence obtain their roots:
(a) x⁴ + x² − 2 = 0; (b) x⁴ + 5x² + 6 = 0; (c) x⁴ − 5x² + 6 = 0.


4.6 Find the real and imaginary parts of each of these complex numbers: 
(a) 2 Ξ59 -- δἰ; (Ὁ) 2- 32; (Ὁ z= 1442); (4) z=17i; (0) 2Ξ 
-3 +i. 

4.7 Write the following numbers in real-imaginary form given that their real 
and imaginary parts are: 


(a) Rez = —ll,Imz=1; (Ὁ) Rez=0,Imz= —3; 
(c) Rez=0,Imz=0; (4) Rez=4,Imz= 17. 
Section 4:2 


4.8 Which of these complex numbers are equal?


z1 = 2 − i, z2 = 1 − i, z3 = 4 + i, z4 = 1 − i, z5 = 2 + i, z6 = 2 − i,
z7 = 1 − i.


4.9 Given that the following complex numbers are equal, deduce the values of 


aand b: 
(a) 2 — 3) = 2 + id; (b) a+ 4i=1 + ib; 
(c) 3+ 7i=a-+ ib; (4) 5+ ia=b + δὶ. 


4.10 Use Definitions 4.2 and 4.3 together with the real number axioms to prove
that (a) z1 + z2 = z2 + z1, thereby showing that complex addition is
commutative, and (b) z1 − z2 = −(z2 − z1).


4-11 Form the sums Ζι + ze given that: 


(a) Ζι =3—i,z2=44 Τὶ: 
(Ὁ) Ζι = —2 — 4i, ze = 2 + 3): 
(c) Ζι = 5+ δὲ, z2 = —5 — δὲ; 
(4) z1 = 4 — 3], ze = 2 + 3i. 


4:12 Form the differences Ζι — z2 given that: 


(a) Ζι = 2+ δὲ, z2 = 4 + 2]: 
(Ὁ) Ζι = —2 + i, z2 = —2 + 2i; 
(c) Ζι = 4+ 7i, z2 = 2+ 7; 


(d) Ζι = 34, z2 = 1 + 3]. 


4:13 Form the products z1z2 given that: 
(a) zz =1+i,2z2 =2+4 3); 
(Ὁ) z1 = 3 — δὶ, z2 = 3 4+ Si; 
(c) Ζι = i, 22 = 4 -- 3]; 
(d) zi = 2,72 = 9 —i. 
4:14 Evaluate (1 + 7)? — (1 — J). 


I 


I 
4:15 Evaluate Gait (-- ἡ 
4.16 Solve these equations for z:

(a) 32 + (9 + 61) = 7 + 3i; (Ὁ) 2z + (3 — 2) = 3 — 2); 

(c) 42 — (4+ 6/) = --3 +7; (d) 3z+(2+/) ~304+2)=1 +i. 
4:17 Form the quotients z1/z2 given that: 

(a) Ζι = 34+ 2], ζὸ Ξε 1] -- ἰ; (0) Ζι = 9+ 3i,z2 = 3 +i; 

(0) Ζι = 8 + δ, zo = 2 — 41. 
4:18 Solve these equations for z: 

(a) 2273 +i) =2 4 3); (b) 32Ζ([ — 2/) = 1 + 4; 

(c) 421 -- ἡ) Ξ 1 - i; (4) 2χ(ά - 7) =1+4 4ϊ. 


4.19 Use Definition 4.4 and the real number axioms to prove that z1z2 = z2z1,
thereby showing that complex multiplication is commutative.


4-20 Use Definition 4-5 to prove that: 


------... 


@) =; (3) = (:); 


(Ὁ (23) = (2); (ὦ (ιζῷ = Ζιξϑ. 


Nu 


4:21 Use the real number axioms together with Definition 4-5 to prove that: 
(21+ z2+°° ++ Zn) = 21+ Ζ. +--+ 4+ Fp. 


4-22 State which of the following polynomials have at least one real root and 
which, if they have complex roots, will have them occur in complex conjugate 
pairs. If no deductions can be made about the nature of the roots, then say so. 


(a) P(z) = 25 + 1624 - 22 + 37241: 
(0) P(z) = z4 + 3z3 + 222 + 1; 
(c) P(z) ΞΞ Ζῇ + 525 — 222 + z + ἰ; 
(d) P(z) = z3 — 6z2 + 2z+ 4.. 
4.23 Given that z = 2 + 3i is a root of the polynomial

$$P(z) = z^4 - 4z^3 + 12z^2 + 4z - 13,$$

deduce the values of the other three roots. Factorize P(z) into linear and
quadratic factors with real coefficients.

4.24 Given that z = i is a root of the polynomial

$$P(z) = z^5 - 2z^4 + 10z^3 - 20z^2 + 9z - 18,$$

deduce the values of the other four roots. Factorize P(z) into linear and
quadratic factors with real coefficients.


Section 4:3 


4:25 Plot the following vectors Ζι and ze in the complex-plane and use geometrical 
methods to form their sum z1 + ze and their difference zi — ze: | 
(a) Ζι = 2+ 34, z2 = —1 + 21; (b) Ζι = 3, z2 = 4 —/7;: 
(c) 71 = 4ϊ, z2 = 3 — 4i; (ὦ z1 = ~1 ~ 23, z = —1 - 2). 
4.26 Find the modulus of each of these vectors:
(a) 4 − 3i; (b) −2 + 3i; (c) 2 − 3i; (d) 3 + 4i; (e) 5.

4.27 Use Definitions 4.5 and 4.7 to prove that:

(a) $z\bar{z} = |z|^2$; (b) $|z_1 z_2| = |z_1|\,|z_2|$;

and give an inductive proof that

$$|z_1 z_2 \cdots z_n| = |z_1|\,|z_2| \cdots |z_n|.$$

4.28 Given that z1 = 3 + 4i, z2 = 4 − 3i, z3 = 2 + i, and z4 = √3 + i, use the
results of the previous problem to compute |z1 z2|, |z1 z2 z3|, and
|z1 z2 z3 z4|. Check your results by direct computation and compare the
relative labour of computation.

4.29 Use the properties of the complex conjugate operation to prove that for any
two complex numbers z1 and z2,

$$z_1\bar{z}_2 + \bar{z}_1 z_2 = 2\,\mathrm{Re}\,(z_1\bar{z}_2).$$

Then, using this result together with the obvious inequality

$$|\mathrm{Re}\,(z_1\bar{z}_2)| \le |z_1\bar{z}_2|$$

and the identity

$$|z_1 + z_2|^2 = (z_1 + z_2)(\bar{z}_1 + \bar{z}_2),$$

prove the triangle inequality,

$$|z_1 + z_2| \le |z_1| + |z_2|.$$

4.30 Use the same form of argument as in Problem 4.29 together with the obvious
inequality $\mathrm{Re}\,(z_1\bar{z}_2) \ge -|z_1\bar{z}_2|$ to prove

$$\bigl|\,|z_1| - |z_2|\,\bigr| \le |z_1 + z_2|.$$

4.31 Give two examples in which the triangle inequality is strict (that is, the sign
≤ is replaced by <). Give two further examples in which it reduces to an
equality.

4.32 Give two examples in which the inequality ||z1| − |z2|| ≤ |z1 + z2| is
strict. Give two further examples in which it reduces to an equality.

Section 4-4 
4.33 Express these numbers in modulus–argument form:
(a) z = −3 + 4i; (b) z = −3 − 4i;
(c) z = −3 + 3i; (d) z = 2√3 − 2i.

4-34 Express the following numbers z in real-imaginary form given that: 
(a) |z| = 4, argz = τ; (b) |z| = 2, argz=—; 

3π 4π 
( |z| = 6, argz =~; (ὦ |z| = 3, argz=- 


4:35 Use the modulus—argument representation of complex numbers to prove: 


(a) 21 22° + + Zn = 21. 22° * " Zn; (b) (15) = (2); 
4.36 Given the following numbers z in real-imaginary form, compute the products
iz. Plot the results in the complex-plane and verify that the effect of
multiplication by i is to rotate a vector anti-clockwise through an angle ½π without
change of size:
z = 3 − 2i; z = −2 + i; z = i; z = −1 − i.

4.37 Form the products z1 z2 and the quotients z1/z2 of the following numbers
expressed in modulus–argument form:
(a) z1 = 3(cos ba + isin ba); z2 = (τος ἐπ 4+ isin 1 πὴ;
(b) Ζι = 4(cos ἐπ — isin $7); ze = 2(cos}a + isin 47);
(c) z1 = Acosta — sini); z2 = 6(cos 32/2 — isin 3π|2).

4.38 The second-order difference equation

$$a u_n + b u_{n-1} + c u_{n-2} = 0$$

has for its general solution the expression

$$u_n = A\lambda_1^n + B\lambda_2^n$$

whenever the characteristic equation

$$a\lambda^2 + b\lambda + c = 0$$

has the distinct real roots λ1 and λ2. If b² − 4ac < 0, so that the
characteristic equation has the complex conjugate roots λ and λ̄, show that if u_n is to
be real, then the constants A and B must also be complex conjugates. Hence
show that if b² − 4ac < 0 and |λ| = r, arg λ = θ, then the general solution
is expressible in the form

$$u_n = r^n (C\cos n\theta + D\sin n\theta),$$

where C and D are real arbitrary constants.
Find the general solution of the following difference equation, and hence
determine the particular solution appropriate to the stated initial conditions:

$$u_n - 3\sqrt{2}\,u_{n-1} + 9u_{n-2} = 0 \quad\text{with } u_0 = 1,\ u_1 = 3.$$


Section 4:5 


4.39 Use de Moivre's theorem to express sin 7θ and cos 7θ in terms of powers of
sin θ and cos θ.

4.40 Use de Moivre's theorem to express sin 11θ and cos 11θ in terms of powers
of sin θ and cos θ.

4.41 Evaluate z²⁰ when z = √3 + i.

4.42 Evaluate z¹⁴ when z = 1 − i√3.

4.43 Calculate the seventh roots of unity.

4.44 Find the roots of the equation w = (−i)^{2/3}.

4.45 Find the roots of the equation w = (1 + i√3)^{1/4}.


Section 4-6 
4.46 Construct the set of cyclic permutations of the four letters a, b, c, and d.


4-47 Construct a table analogous to Table 4-1 for a left-handed system of axes. 


4-48 Determine the lengths | OP | of the vectors OP given that O is the origin 


and the points P are: 
4.49 Find the lengths |OP|, the direction cosines and the angles θ1, θ2, θ3 of the
vectors OP, where the points P are:
(a) (2, --, —1); (b) (4, 0, 2); (c) (=1; 2 1).

4.50 Find the direction ratios, the direction cosines and the angles θ1, θ2, θ3 of
the vectors OP, where the points P are:
(a) (!, 1, 1); (Ὁ) (-1,1,D; © @, 1, -D.

4.51 Determine the angles θ1, θ2, θ3 for the vectors with the direction cosines:
WS tess 1 1 1 1 1 v7
ee OP: 00} τας se sae cae ἢ
(a) ΣΝ ] ts [π- veya ΘΙ» τ τ

4.52 Given that a vector makes acute angles with each of the coordinate axes and
ΗΝ : I
that its direction cosines are [ει m, wap deduce the value of »: and hence
find the angles.

4.53 Use the fact that a vector makes an acute angle with each of the coordinate
axes and that its direction ratios are 1, 2, 2 to determine the angles θ1, θ2,
and θ3 that it makes with the coordinate axes.

4.54 Determine the lengths |AB| of the vectors AB, given that the end points
A and B are:
(a) A = (1.1, 1), B
(b) A= (2,-1,1), B i ;
(c) A = (−1, 3, 1), B = (−2, −1, 0).
Use your results to determine the direction cosines for each of these vectors.

4.55 Write down the position vectors OP in terms of the unit vectors i, j, k given
that O is the origin and the points P are:
(a) (1,1, 1); (Ὁ) (—2,3,7); © G,—1, 11; (d) (0,1, 0).

4.56 Write down the x, y, and z-components of these vectors:
(a) 31] -- 21 ἘΚ; (Ὁ) -—i+ 3. + 11k; ()i-—k; (d) j + 3k.

4.57 Form the vector a = αi + βj + γk, given that
(1 − α)i + 2βj + (2γ − 1)k = 2i + j + 3k.

4.58 Determine the values of α, β, and γ in order that:
(1 — «i + BU — αδ)ὲ' + (ἡ — 2)Κ = LE+ 3] 4+ 2k.

4.59 Form the sum a + b and difference a − b of the vectors:
(a) a = 3i − 2j + k, b = −i − 2j + 3k;
(b) a = −i + 2j − k, b = 2i − 4j + 2k;
(c) a = 2j − 3k, b = 2i − j + k.

4.60 Prove from the definitions of addition and subtraction of vectors that for
any vectors a and b
(a) a + b = b + a and (b) a − b = −(b − a).

4.61 Find |AB| and the direction cosines of the vectors AB given that A and B
are the points:
(4) Α Ξ ((,1,1), B=@,-1,);
(c) Α ΞΞ (4,1, 1), B = (-—1, --Ι, —1).


4.62 State which of the following pairs of vectors a and b are parallel and which 
are anti-parallel: 
(a) a = i − 3j + k, b = −4i + 12j − 4k;
(b) a = −2i + 3j − k, b = 2i − 3j + k;
(c) a = 4i − j − 3k, b = 8i − 2j − 6k;
(d) a = i + 7j + k, b = 3i + 21j + 3k.


Section 4:7 
4:63 Express the following vectors a as the product of a scalar and a unit vector: 
(a) a = 2i-—j+3k; (b)a=3i-3j+k; (a= —ittj—te 


4-64 Find the vectors AB, and their direction cosines given that A and B have 
position vectors a and b, respectively, where 
(a) a = 3i — 3} + Sk, b=i+ 2] —k; 
(Ὁ) a= 2i+ 2] ἘΚ, b=i+ 3j + 2k. 

4.65 Verify the inequalities ||a| − |b|| ≤ |a + b| ≤ |a| + |b| for the pairs
of vectors:


(a) ἃ =i — 2] —k, b = 2i — 3. ἘΚ; 

(b) a = 3i — - 4] - Κι, b = 6i — 8j + 3k; 

(c) a = 2i + 3] — k, b = —6i — 9j + 3k. 
4-66 Find the angle between the vectors a and b where: 

(a)a=i+j+k, b= 2i+ j —k; 


(Ὁ) a= —i+ 2j+ 2k, b= 2] -- Ἶ -- 2Κ. 


4:67 Give two examples of pairs of vectors that are orthogonal but are not parallel 
to the vectors i, j, or k. 


4:68 Give two different proofs of the fact that scalar multiplication of vectors is 
commutative by using the two alternative definitions of the scalar product. 


4.69 Find the scalar products a·b and hence find the angle between the vectors
a and b given that:
(a) a = 7i − 3j + k, b = −i + 2j + 2k;
(b) a = 2i − 2j + k, b = −3i − 3j + 4k;
(c) a = i + 2j + 3k, b = −2i − 4j − 6k.


4-70 Find unit vectors parallel to the vectors a where: 
(8) ἃ -- 2| --, 2 ἘΚ; (b) a= —3i+j+ 2k; (c) a= Τί — 2j — 3k. 


4.71 Prove the distributive law for the scalar product by using either definition of 
the scalar product. 


4:72 Form the vector products a x b if: 
(a) a =i — 2j — 4k, b = 2i — 2j + 3k; 
(b)a=—-i+4j—k, b= 3i+ 2] + 4k; 
(c) a = —2i + 4k, b = 3j — 2k. 
4.73 Evaluate the determinants:
(a) |2 1 (b) 4 16 (c) 
4 6 —2 6 


" 3 


2 Of aa 4 
at 1 3) 


4:74 For what values of A, if any, do these determinants vanish: 


4). λ: 2] )]42 1 ©]3 4] [31 4 
3: 1} "4: 0 2 2 a 
4.75 Evaluate the determinants: 
(4) [2 1 1 (0) [3 4 5 (Ο) ]3 4 5 
1. Ὁ 1}: 2. 2 1:}: 7. 1 ΣΙ, 
1 1 1 1 0 2 6.5 7 


4:76 For what values of 4 do the following determinants vanish: 


(a){/ A 1 2 (b)| 2 ὁ | ic) |} O 1 2 
1. AcE: 0 2/4 1/14; A 1 OO}, 
2 2 1 0 [| 3 lL ὁ 1 
4:77 Use Definition 4:18 to prove that a x b = —(b x a) for arbitrary vectors 
a and b. 


4:78 Evaluate the vector products b Χ a given that: 


(a) a = 2i — j + 2k, = —3i+ 23+ k; 
(0) a= —i+j+k, = 4 + 2] + 3k; 
(c)a=—-i-j-k, b= 2i+ 2j + 2k. 


4:79 Determine unit vectors that are normal to both vectors a and b when: 
(a) ἃ = 3i+ Sj— 2k, b=i+j+k; 
(0) a = —4i + 2k, b = j — 3k. 
State whether the results are unique and, if not, in what way are they in- 
determinate. 


4.80 Use the definition of a vector product to prove that it is distributive and so 
ax (b+c)=axb+axce. 


4.81 Use Definition 4.18 of a vector product to prove that when a and b are
non-zero vectors, then a × b = 0 if, and only if, a and b are parallel.


4:82 Use Definition 4-18 to evaluate the vector products a x b given that: 
(a) a= —i+ 4] -- 2k, b= 21 - 3j +k; 
(b) a= —2i — 3j + k, b= 6i + 9] — 3k; 


(c)a = 3i—k, == 2]. 
Evaluate these same vector products using Table 4-3 and compare the effort 
involved. 


4.83 Verify the distributive property of the vector product:
a × (b + c) = a × b + a × c,
given that a = 2i + j − k, b = i − 2j + k and c = 3i − 2j + 3k.

4.84 Evaluate the triple scalar products a·(b × c) and (b × a)·c given that:
(a) a = 2i − j − 3k, b = 3k, c = i + 2j + 2k;
(b) a = i + 2j + k, b = 2i + j + k, c = 4i + 2j + 2k.

4.85 Prove that if a, b, and c form three edges of a parallelepiped all meeting at a
common point, then the volume of this solid figure is given by |a·(b × c)|.
Deduce that the vanishing of the triple scalar product implies that the vectors
a, b, and c are co-planar (that is, all lie in a common plane).


4.86 Determine the vector products a × (b × c) given that:
(a) a = 2i − j − 3k, b = 3i + j + k, c = −i + j + k;
(b) a = −i + j − k, b = 2i − 2j + 2k, c = i + k.


4.87 Prove that (a × b) × c = (a·c)b − (b·c)a.


Section 4-8 


4.88 Find the vector equation of the line through the point with position vector
2i − j − 3k which is parallel to the vector i + j + k. Determine the points
corresponding to λ = −3, 0, 2 in the resulting equation.


4.89 Find the vector equation of the line through the points A and B with position
vectors a = 2i + j − k and b = −i + j + 2k. Determine the direction
cosines of this line.


4:90 The equations 


3x +3 ~2y+1 2246 
2. ie 


determine a Straight line. Express them in vector form and find the direction 
cosines of the line. 


4.91 If the points A and B have position vectors a and b, and point C divides the
line AB in the ratio λ : μ, show that C has the position vector

$$\frac{\mu\mathbf{a} + \lambda\mathbf{b}}{\lambda + \mu},$$

provided λ + μ ≠ 0.

4.92 Find the vector equation of the line that passes through the point A with
position vector a = −2i − j + k and is normal to both the vectors b and c,
where b = i + 2j + 3k and c = −i + j − k.


493 Find the perpendicular distance of the point 2i+ 4+ k from the line 
r= (1+ Dit (2 -- 31) + (24+ Dk. 


4.94 Find the perpendicular distance of the point i + 3j + 2k from the line 
2x~1l y+2 2-1 
2 ~ 3° 2 
4.95 Find the Cartesian equation of the plane containing the point 2i − j + 2k
and normal to i + j + k.


4.96 Find the Cartesian equation of the plane containing the point 3i − k and
also containing the two vectors α, β where α = i + 2j + k and
β = −i + 2j + 2k.

4.97 Find the angle between the two planes


CS ΟΥ̓ z—3 


4.98 Find the angle between the plane z = 2 and the plane

Rede ory es ΒΗ
ἜΝ ΕΣ er

4.99 Let Π be the plane r·n̂ = p, where n̂ is the unit normal to Π and p is its
perpendicular distance from the origin. By constructing a plane Π′ parallel
to Π through point P with position vector a, show that the perpendicular
distance of P from Π is given by the expression |a·n̂ − p|. What form
would this expression take if the plane was expressed in the form r·n = q,
where |n| ≠ 1?

4.100 A line may be uniquely determined as the intersection of two planes

r·n1 = p1 and r·n2 = p2 (A)

where n1 and n2 are not necessarily unit vectors. The direction of the line is
normal to both n1 and n2 and so is parallel to n1 × n2. Hence the line has the
equation r = a + λ(n1 × n2), where λ is a parameter and a is the position
vector of some point common to the two planes in (A). Apply these arguments
to obtain the vector equation of the line determined by the planes

Ky = ΖΕ and 20 yar ee =i.

4.101 Find the Cartesian equation of the sphere of radius 3 about the centre
a = 2i + 3j + k.

4.102 Construct the Cartesian equation of the sphere of radius 4 that lies on the
side z > 0 of the plane z = 0 and is tangent to it at the point (3, 1, 0).

4.103 The inward drawn normal to a sphere of radius 2 at the point (1, 1, 2) on its
surface is n = 2i − j + k. Deduce its equation in Cartesian form.


Section 4:9 


4.104 Forces F1, F2, F3, and F4 have magnitudes 2√6, 3√5, 3, and 15 lb and act
concurrently through a point O along the lines of the vectors −i + 2j − k,
2i + k, 2j, and 4i + 3j, respectively. Find the resultant of these forces and
determine its magnitude in lb.

4.105 Forces 1, 2, and 3 act at one corner of a cube along the diagonals of the faces
meeting at that corner. Find the magnitude of their resultant, and its
inclination to the edges of the cube.

4.106 A sphere of 10-in radius and mass 20 lb has one end of a string 18 in long
attached to its surface and hangs at rest against a smooth vertical wall to
which the other end of the string is attached. The string has a tension T and
the wall exerts a normal reaction R at its point of contact with the sphere.
Use a vector triangle of forces to determine T and R.

4.107 Deduce that for three concurrent forces F1, F2, and F3 to be in equilibrium
they must form a closed vector triangle of forces, and hence be coplanar.
Use your result to prove Lami's theorem, which asserts that when three
concurrent forces are in equilibrium, the magnitude of each force is proportional
to the sine of the angle opposite to it in the vector triangle of forces.

4.108 Find the centre of mass of the masses 1, 3, 4, and 2 lb situated at points with
the respective position vectors 3i − j + k, 2i + 2j + 2k, −i + 7j − k, and
4i − 10k.

4.109 Prove that the centre of mass of a system of masses is independent of the
choice of origin.
(Hint: Choose a new origin O′ with position vector b relative to the original
origin O and apply the definition of centre of mass.)

4.110 The velocity of a boat relative to the water is represented by 4i + 3j, and that
of the water relative to the earth by 2i − j. What is the velocity of the boat
relative to the earth if i and j represent velocities of 1 mile/h to the east and
north, respectively?

4.111 The point of application of the force 9i + 6j + 7k moves a distance 5 ft in
the direction of the vector 3i + j + 4k. If the modulus of the force vector is
equal to the magnitude of the force in lb, find the work done.

4.112 A body spins about a line through the origin parallel to the vector 2i − j + k
at 15 rad/s. Find the angular velocity vector Ω for the body and find the
instantaneous linear velocity of a point in the body with position vector
i + 2j + 3k.

4.113 Find the torque of a force represented by 3i + 6j + k about point O given
that it acts through the point with position vector −i + j + 2k relative to O.

4.114 Masses 1, 3, and 2 units at the points specified by the position vectors 3i − k,
2i − 3j + k, and i + j + k relative to point O have velocities represented by
2i + k, 3i + j + 2k, and i − j + k, respectively. Determine the vector sum
of the moments of momentum of each of these masses about O.


Differentiation of functions 
of one or more real 
variables 


5.1 The derivative


The important branch of mathematics known as the calculus is concerned 
with two basic operations called differentiation and integration. These 
operations are related and both rely for their definition on the use of limits. 

The calculus was founded jointly, and independently, by Newton in 
England, and by his contemporary Leibnitz in Germany to whom we owe 
the essentials of our present day notation. In introducing the ideas underlying 
a derivative we shall make use of a simple dynamical problem in very much 
the same way that Newton did when first formulating his early ideas on 
differentiation. However we have the advantage of understanding the nature 
of a limit more clearly than was the case in his day, so that after presenting 
our heuristic argument, we shall quickly formalize it in terms of the ideas set 
down in Chapter 3. 

We shall consider how to define and determine the instantaneous speed 
of a point P moving in a non-uniform manner along a straight line. To be 
precise, we shall suppose that a fixed point O on the line has been selected, 
and that the distance s of point P from O at time t is determined by the
equation


s = f(t),


where f(t) is some suitable continuous function of t defined on some interval
ℐ. Thus we know the position of P at a general time t, and are required to
use this information to define and find the speed of P at any given instant of 
time. When the motion of P is uniform, so that its displacement is proportional 
to the elapsed time, the familiar definition of speed as distance per unit time 
can be used. However if the motion is non-uniform we must consider the 
situation more carefully. We shall use intuition here and first consider the 
difference quotient 


\[ \frac{f(t_2) - f(t_1)}{t_2 - t_1} \tag{5.1} \]


in which t1 and t2 are two different times belonging to ℐ.

It seems reasonable to suppose that if t2 were to be taken sufficiently
close to t1 then expression (5.1), which is the quotient of the finite distance
travelled and the elapsed time, would in some sense provide a measure of the
average speed of P in the small time interval t2 - t1. Even better would be the
idea that we compute the difference quotient (5.1) not for one time t2 close
to t1, but for a monotonic sequence {τi} of times having for its limit the time
t1, which is not a member of the sequence. This last condition is necessary
because Eqn (5.1) is not defined if t2 = t1. Then if the sequence of difference
quotients corresponding to Eqn (5.1) has a limit we propose to call the value
of this limit the instantaneous speed u(t1) of P at time t1.
Expressed in the symbolic form of Chapter 3 we may write

\[ u(t_1) = \lim_{i \to \infty} \frac{f(\tau_i) - f(t_1)}{\tau_i - t_1}. \tag{5.2} \]

This definition is obviously consistent with the case of the uniform motion 
of P, for then every difference quotient involved in the determination of the 
limit (5.2) would give the same constant value u, say. We will call this value u
the constant speed of P.

As the function f(t) is continuous it is clearly desirable that we define u(t1)
not in terms of the discrete variable τi but in terms of a continuous variable τ.
Fortunately we can do this easily, for the conditions of the connecting
Theorem 3.6 are satisfied and allow us to rewrite Eqn (5.2) thus:

\[ u(t) = \lim_{\tau \to t} \frac{f(\tau) - f(t)}{\tau - t}. \tag{5.3} \]


We have now dropped the suffix 1 since t1 was not specific and represented
any value of the time t belonging to ℐ.

It should be appreciated that the limit u(t) in Eqn (5-3) is a number and 
not a ratio of quantities as were the members of the sequence used to define 
the limit. The instantaneous speed u(t) can be interpreted as the distance 
through which P would move in unit time if, during that time, it were to move 
at a constant speed equal to the value u(t). Because Eqn (5-3) is consistent 
with the notion of a constant speed, it is customary to omit the adjective 
‘instantaneous’ and to speak only of the speed of P. 

The limit involved in Eqn (5-3) is of the indeterminate type and it will be 
our object to devise techniques for evaluating such limits for a wide class of 
functions f(t). In trivial cases these may be determined by simple algebraic
considerations as this example shows. 


Example 5.1 Suppose that the distance of a point P from a fixed origin at
time t is determined by the equation f(t) = kt^3, where k is a constant with
dimensions (Length)(Time)^-3. Find the functional form of the speed u(t)
at time t, and determine its value when t = 4.


Solution We are here required to evaluate the limit

\[ u(t) = \lim_{\tau \to t} \frac{k\tau^3 - kt^3}{\tau - t}, \]

which is the form assumed by Eqn (5.3) when f(t) = kt^3.
Using the identity τ^3 - t^3 = (τ - t)(τ^2 + τt + t^2) we may write

\[ \begin{aligned}
u(t) &= \lim_{\tau \to t} \frac{k(\tau - t)(\tau^2 + \tau t + t^2)}{\tau - t} \\
     &= \lim_{\tau \to t} k(\tau^2 + \tau t + t^2) \\
     &= 3kt^2.
\end{aligned} \]

Thus the functional form of the speed is u(t) = 3kt^2, so that at t = 4 the
speed has the value u(4) = 48k.
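
The limiting process in Example 5.1 can also be watched numerically. The following is only a sketch: the value k = 2.0 and the helper name difference_quotient are assumptions made for illustration and are not part of the text. It evaluates the quotient inside Eqn (5.3) for values of τ approaching t = 4 and compares it with 3kt^2.

```python
# Sketch: watch the difference quotient of Eqn (5.3) approach 3kt^2 for f(t) = kt^3.
# The value k = 2.0 is an arbitrary assumption made only for this illustration.

def difference_quotient(f, t, tau):
    """Return [f(tau) - f(t)] / (tau - t); undefined when tau == t."""
    return (f(tau) - f(t)) / (tau - t)

k = 2.0

def f(t):
    return k * t**3

t = 4.0
limit_value = 3 * k * t**2  # 48k, the value found in the solution

for step in (1e-1, 1e-2, 1e-3, 1e-4):
    q = difference_quotient(f, t, t + step)
    print(f"tau - t = {step:g}: quotient = {q:.6f}, gap = {q - limit_value:.2e}")
```

The gap shrinks roughly in proportion to τ - t, as the factor k(τ^2 + τt + t^2) in the solution suggests.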

It is often helpful to check the form of a result by means of dimensional 
analysis. This is achieved by representing the fundamental quantities of mass, 
length, and time occurring in expressions and equations by the symbols M, 
L, and T, and ignoring any purely numerical multipliers that may be involved. 
The equations then become identities between expressions of the form L^p M^r T^s,
where p, r, and s are real numbers. Quantities other than length, mass, and
time are represented as suitable combinations of these fundamental quantities.
Thus speed and acceleration would be written LT^-1 and LT^-2, respectively,
with no account being taken of their magnitudes. We illustrate this approach
with Example 5.1. By supposition k has dimensions LT^-3, so that from the
form of the solution we see that u(t) must have the dimensions kT^2 = (LT^-3)T^2
= LT^-1, which are the dimensions of speed, as required.
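
The bookkeeping of dimensional analysis is easily mechanized. In the sketch below a dimension L^p M^r T^s is stored as the exponent triple (p, r, s); this representation and the helper names dim_mul and dim_pow are illustrative assumptions, not notation from the text. The check repeats the argument just given for u(t) = 3kt^2.

```python
# Sketch: dimensions stored as exponent triples (p, r, s) standing for L^p M^r T^s.

def dim_mul(a, b):
    """Dimensions of a product: the exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def dim_pow(a, n):
    """Dimensions of a quantity raised to the power n: the exponents scale."""
    return tuple(x * n for x in a)

TIME = (0, 0, 1)      # T
K = (1, 0, -3)        # k has dimensions L T^-3 by supposition
SPEED = (1, 0, -1)    # L T^-1

# u(t) = 3 k t^2; the purely numerical multiplier 3 is ignored.
u_dim = dim_mul(K, dim_pow(TIME, 2))
print(u_dim, u_dim == SPEED)
```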


Fig. 5.1 Speed interpreted as a derivative. 


There is a valuable graphical interpretation of the limit (5.3) shown in
Fig. 5.1, which is the graph of a function f(t) together with the chord PQ,
where P is the point (t, f(t)) and Q the point (τ, f(τ)).

The difference quotient in Eqn (5.3), before the limit is taken, is the
tangent of the angle QPR. In the limit as τ → t, the point Q
approaches the point P and the chord PQ approaches the tangent PS to the
curve y = f(t) at P. The value u(t) arrived at by considering the limit of the
difference quotient (5.3) is thus the tangent of the angle SPR and so is equal
to the gradient or slope of the curve y = f(t) at P. The number u(t) evaluated
at any specific time t = t1 is the derivative of f(t) with respect to t at t = t1.
The limit u(t) as a function of t is simply called the derivative of f(t) with
respect to t and the operation of computing the derivative of a function is
called differentiation. A function that possesses a derivative at each point of an
interval is said to be differentiable in that interval. Hence in Example 5.1, the
derivative of kt^3 with respect to t at t = 4 is 48k, whereas the derivative of
kt^3 with respect to t is the function 3kt^2. The function kt^3 is obviously
differentiable in any finite interval.

This heuristic approach has served to introduce the limiting arguments 
underlying the concept of a derivative, and we must now carefully reformulate 
these arguments and express them in general terms. We shall use the following 
key definitions. 


DEFINITION 5.1 A function f(x) of the real variable x will be said to be
differentiable at x0 if, and only if,

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

exists and is independent of the side from which x approaches x0. More
generally, f(x) will be said to be differentiable in an interval ℐ if it is
differentiable at each point of ℐ. At any points of ℐ for which the limit is not
defined the function f(x) will be said to be non-differentiable.


DEFINITION 5.2 If f(x) is a differentiable function of the real variable
x at x0, then the value of the expression

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

will be denoted by f'(x0) or \( \left.\frac{df}{dx}\right|_{x=x_0} \), and we shall say that it is the derivative
of f(x) at x = x0. If further we define y by the equation y = f(x), then we
can also write the derivative of f(x) at x0 in the form \( \left.\frac{dy}{dx}\right|_{x=x_0} \).

These definitions merely express in a more sophisticated way, what is 
usually put as follows. 
Let y = f(x). Then if δy is the increment in y occasioned by an increment
δx in x, we have y + δy = f(x + δx) and hence

\[ \frac{\delta y}{\delta x} = \frac{f(x + \delta x) - f(x)}{\delta x}. \]

Thus at x = x0,

\[ \frac{\delta y}{\delta x} = \frac{f(x_0 + \delta x) - f(x_0)}{\delta x} \]

and so

\[ \left.\frac{dy}{dx}\right|_{x=x_0} = \lim_{\delta x \to 0} \frac{f(x_0 + \delta x) - f(x_0)}{\delta x}. \]

To obtain the formulation of Definition 5.2 above, first write h in place
of δx to obtain

\[ \left.\frac{dy}{dx}\right|_{x=x_0} = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}, \]

and then write x in place of x0 + h, so that h = x - x0.
What does the requirement that lim {[f(x) - f(x0)]/(x - x0)} as x → x0 should
exist actually mean? It is this. There is a number f'(x0) such that the left-
and right-hand limits of the function g(x) = [f(x) - f(x0)]/(x - x0) as x
approaches x0 exist and are both equal to f'(x0). The function g(x) itself is
defined near but not at x = x0 but has the property that lim g(x) = f'(x0)
as x → x0.
We shall use this idea together with Theorem 3.4 when we discuss the general
properties of derivatives of combinations of functions.
If in Definition 5.2 we write x0 + h in place of x, and replace x0 by x in
the subsequent result, we may formulate this definition.


DEFINITION 5.3 If y = f(x) is a differentiable function of the real variable
x at all points of an interval ℐ, then the derivative of f(x) in ℐ is the function
denoted either by f'(x) or dy/dx and defined by

\[ f'(x) = \frac{dy}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]


The operation of computing the derivative of a function is differentiation. 


Let us now apply exactly the same arguments to Fig. 5.2 as were used in
connection with the speed at a point of the particle trajectory in Fig. 5.1.
This time the graph represents any function y = f(x) satisfying the conditions
of Definition 5.3. Then if P is any point in the interval within which f(x) is
differentiable, and Q is an adjacent point, the chord PQ is, in some sense, an
approximation to the tangent line to the curve PR at P. The limiting position
of the chord PQ will lie along the tangent line to the curve at P and in terms
of angles we have θ → α as Q → P. However,

\[ \frac{f(x + h) - f(x)}{h} = \tan\theta \]

so that

\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \tan\theta, \]

whence, finally,

\[ f'(x) = \tan\alpha, \tag{5.4} \]

or, equivalently,

\[ \frac{dy}{dx} = \tan\alpha. \tag{5.5} \]


Fig. 5.2 Derivative interpreted as a gradient.


This result shows that we may interpret the derivative of a differentiable
function at a point as the gradient of the tangent line drawn to the curve at
that point. It is implicit in the definition that the tangent line so defined should
be independent of whether Q approaches P from the left or right.

The geometrical interpretation of a derivative allows us to see quite
clearly that in addition to the function needing to be continuous in the
neighbourhood of a point at which it is required to be differentiable, it also
needs a special kind of smoothness. Specifically, the left- and right-hand
tangents to the curve at the point in question must be one and the same.
Indeed, we could re-phrase our definition of differentiability in terms of the
equality of the left- and right-hand tangents at a point on the curve, just as we
did when dealing with continuity.


Fig. 5.3 Non-differentiable function at x = x1 and x = x2.

Consider the function f = f(x) shown in Fig. 5.3 and defined on the
interval [x0, x3], but only continuous in the semi-open intervals [x0, x2)
and (x2, x3].

Then, despite the fact that the function f(x) is continuous in [x0, x2)
and (x2, x3], it is only possible to assert that tangent lines in the sense implied
by Definition 5.3 can be constructed for points in the open intervals (x0, x1),
(x1, x2), and (x2, x3). No tangent line can be constructed at x2 because of the
discontinuity; two tangent lines l1 and l2 can be constructed at point P
according as A and B approach P from the left and the right; whilst only
right- and left-hand tangents l3 and l4 can be constructed at the end points
x0 and x3 because the function f(x) is not defined outside [x0, x3].

We shall now show how Definition 5-3 may be used to determine the 
derivative of a function and also to prove its non-differentiability at a certain 
point. Our example is a continuous function whose behaviour is clear at all 
points other than the origin, at which the existence, or otherwise, of a tangent 
line to the curve cannot be deduced by inspection of its graph. 


Example 5.2 Prove that the function f defined by f(x) = x sin (1/x) for
x ≠ 0 and f(0) = 0 is continuous in (-∞, ∞) and sketch its graph. Find its
derivative by use of Definition 5.3 and show that it is not differentiable at
the origin.


Fig. 5.4 The function y = x sin (1/x).


Solution Only the behaviour of f in the vicinity of the origin is in doubt here.
When x ≠ 0 we may write f(x) = [sin (1/x)]/(1/x), showing that for large x,
f(x) behaves like lim (sin h)/h = 1 as h → 0. Conversely, as the origin is
approached, x → 0 and because sin (1/x) is bounded by ±1 it follows that
lim f(x) = 0 as x → 0.
The limit of the function f(x) at the origin is thus equal to the functional value
itself and so f(x) is continuous at the origin. It is clearly continuous elsewhere
since it is the product of two continuous functions. Hence it is everywhere
continuous and Fig. 5.4 shows its graph, which is symmetric about the y-axis
because f(x) is an even function.

We shall approach the differentiability question in two stages: first for
x ≠ 0, and then for x = 0. Assuming x ≠ 0 and making a direct application
of Definition 5.3 we obtain

\[ f'(x) = \lim_{h \to 0} \frac{(x + h)\sin\dfrac{1}{x + h} - x\sin\dfrac{1}{x}}{h}, \]

which we re-express as

\[ f'(x) = \lim_{h \to 0} \frac{(x + h)\sin\left\{\dfrac{1}{x}\left[1 + \dfrac{h}{x}\right]^{-1}\right\} - x\sin\dfrac{1}{x}}{h}. \]

Now for h close to zero we may use the binomial theorem together with our
'little oh' notation of Section 3.4, to write [1 + (h/x)]^-1 = 1 - (h/x) + o(h)
as h → 0, and hence

\[ f'(x) = \lim_{h \to 0} \frac{(x + h)\sin\left\{\dfrac{1}{x}\left(1 - \dfrac{h}{x} + o(h)\right)\right\} - x\sin\dfrac{1}{x}}{h}. \]

Next we write the argument of the sine function as

\[ \frac{1}{x} - \frac{h}{x^2} + \frac{o(h)}{x} \]

and use the trigonometric expansion for the difference of two angles to obtain

\[ f'(x) = \lim_{h \to 0} \frac{(x + h)\sin\dfrac{1}{x}\cos\left(\dfrac{h}{x^2} - \dfrac{o(h)}{x}\right) - (x + h)\cos\dfrac{1}{x}\sin\left(\dfrac{h}{x^2} - \dfrac{o(h)}{x}\right) - x\sin\dfrac{1}{x}}{h}. \]

Consider the behaviour of the terms comprising this quotient. If the first
and last terms are taken together then in the limit as h → 0 they reduce to
the single term sin (1/x). The remaining term in the centre is

\[ -\frac{(x + h)\cos\dfrac{1}{x}\sin\left(\dfrac{h}{x^2} - \dfrac{o(h)}{x}\right)}{h} \]

and since x ≠ 0 is fixed, it follows from limit (3.9) that this reduces to

\[ -\frac{1}{x}\cos\frac{1}{x} \]

as h → 0.

Combining these two results we find that the derivative f'(x) is

\[ f'(x) = \sin\frac{1}{x} - \frac{1}{x}\cos\frac{1}{x} \quad \text{for } x \neq 0. \]

Thus we have used Definition 5.3 to compute the derivative, and as this is
defined for all x ≠ 0 it follows that y = x sin (1/x) is differentiable for all
such x.

Finally we must examine the behaviour of the derivative at the origin
using Definition 5.3. Setting x = 0 we obtain

\[ f'(0) = \lim_{h \to 0} \frac{h\sin(1/h) - 0}{h} = \lim_{h \to 0} \sin(1/h). \]

As sin (1/h) oscillates boundedly with ever increasing frequency when
h → 0, it follows that f'(0) is not defined. This establishes the
non-differentiability of f(x) at the origin as was required.
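
Both conclusions of Example 5.2 can be illustrated numerically. The sketch below is not part of the original argument: the sample point x = 0.5, the step sizes, and the function names are arbitrary assumptions. It compares the difference quotient of Definition 5.3 with the derivative found above for x ≠ 0, and then evaluates the quotient at the origin for steps h chosen so that sin (1/h) is alternately +1 and -1.

```python
import math

def f(x):
    """f(x) = x sin(1/x) for x != 0, with f(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0

def f_prime(x):
    """Derivative found in Example 5.2; valid only for x != 0."""
    return math.sin(1.0 / x) - math.cos(1.0 / x) / x

def quotient(x, h):
    """Difference quotient [f(x + h) - f(x)] / h of Definition 5.3."""
    return (f(x + h) - f(x)) / h

# Away from the origin the quotient settles down to f'(x).
x = 0.5
for h in (1e-3, 1e-5, 1e-7):
    print(f"h = {h:g}: quotient = {quotient(x, h):.6f}, f'(x) = {f_prime(x):.6f}")

# At the origin the quotient equals sin(1/h) and keeps swinging between +1 and -1.
for n in range(1, 5):
    h_plus = 1.0 / ((2 * n + 0.5) * math.pi)   # here sin(1/h) = +1
    h_minus = 1.0 / ((2 * n + 1.5) * math.pi)  # here sin(1/h) = -1
    print(f"h = {h_plus:.5f}: {quotient(0.0, h_plus):+.6f}   "
          f"h = {h_minus:.5f}: {quotient(0.0, h_minus):+.6f}")
```

However small h is taken, steps of both kinds can be found, so the quotient at the origin has no limit.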
We close this section by deducing the derivatives of some important
elementary functions, and stating them as theorems.


THEOREM 5.1 The derivative of a constant function is zero.


Proof Let k be any constant and consider the function f(x) where f(x) = k
for all x. Then

\[ \frac{f(x + h) - f(x)}{h} = \frac{k - k}{h} = 0 \quad \text{for all } x. \]

Hence

\[ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = 0 \quad \text{for all } x. \]


THEOREM 5.2 If n is any positive integer, then the real function y = x^n
is differentiable everywhere and has the derivative dy/dx = nx^(n-1). If m is
any negative integer, then the function y = x^m is differentiable everywhere
except at the origin and has the derivative dy/dx = mx^(m-1).


Proof We must first consider the limit of the difference quotient
[(x + h)^n - x^n]/h. By the binomial theorem we have

\[ \begin{aligned}
\frac{(x + h)^n - x^n}{h}
 &= \frac{x^n + nx^{n-1}h + \dfrac{n(n-1)}{2!}x^{n-2}h^2 + \cdots + \dbinom{n}{r}x^{n-r}h^r + \cdots + h^n - x^n}{h} \\
 &= nx^{n-1} + \frac{n(n-1)}{2!}x^{n-2}h + \cdots + \binom{n}{r}x^{n-r}h^{r-1} + \cdots + h^{n-1}.
\end{aligned} \]

Now lim h = 0 as h → 0, so lim h^r = 0 for 1 ≤ r ≤ n - 1 and so

\[ \lim_{h \to 0} \binom{n}{r} x^{n-r}h^{r-1} = 0 \quad \text{for } 2 \le r \le n. \]

Consequently,

\[ \lim_{h \to 0} \frac{(x + h)^n - x^n}{h} = nx^{n-1}. \]

This is defined for all finite x including x = 0 and so proves the first part of
the theorem. Next let m = -n. Then

\[ \frac{(x + h)^m - x^m}{h} = \frac{1}{h}\left[\frac{1}{(x + h)^n} - \frac{1}{x^n}\right] = -\left[\frac{(x + h)^n - x^n}{h}\right]\frac{1}{x^n(x + h)^n}. \]

Now from our result above

\[ \lim_{h \to 0}\left\{-\frac{(x + h)^n - x^n}{h}\right\} = -nx^{n-1}, \]

whilst lim (x + h) = x and so lim (x + h)^n = x^n as h → 0.
If x ≠ 0,

\[ \lim_{h \to 0} \frac{1}{x^n(x + h)^n} = \frac{1}{\lim\limits_{h \to 0} x^n \cdot \lim\limits_{h \to 0} (x + h)^n} = \frac{1}{x^{2n}}. \]

Thus

\[ \lim_{h \to 0} \frac{(x + h)^m - x^m}{h} = -\frac{nx^{n-1}}{x^{2n}} = -nx^{-n-1} = mx^{m-1}. \]

Hence we have proved that

\[ \left.\frac{dy}{dx}\right|_{x=x_0} = \left.\frac{d}{dx}(x^n)\right|_{x=x_0} = nx_0^{n-1} \]

for all x0 if n is a positive integer, and for all non-zero x0 if n is a negative integer.

Later we shall prove this result for all real n. Henceforth we shall use the
result freely, irrespective of the value of n.


THEOREM 5.3 The functions sin αx and cos αx of the real variable x,
where α is any real number, are differentiable everywhere and

\[ \frac{d}{dx}(\sin\alpha x) = \alpha\cos\alpha x, \qquad \frac{d}{dx}(\cos\alpha x) = -\alpha\sin\alpha x. \]


Proof These results follow by applying Definition 5.3 and then using limits
(3.9) and (3.10). Thus we have

\[ \begin{aligned}
\frac{d}{dx}(\sin\alpha x)
 &= \lim_{h \to 0} \frac{\sin\alpha(x + h) - \sin\alpha x}{h} \\
 &= \lim_{h \to 0} \frac{\sin\alpha x\cos\alpha h + \cos\alpha x\sin\alpha h - \sin\alpha x}{h} \\
 &= \sin\alpha x \lim_{h \to 0}\left(\frac{\cos\alpha h - 1}{h}\right) + \cos\alpha x \lim_{h \to 0}\left(\frac{\sin\alpha h}{h}\right) \\
 &= 0 + \alpha\cos\alpha x.
\end{aligned} \]

As this function is defined for all finite x, the first part of the required result
has been established. The remainder of the proof follows exactly similar lines,
and so will be omitted.
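
Theorem 5.3 is easily checked numerically. The following sketch is illustrative only: the value α = 3.0, the sample points, and the step h = 1e-6 are arbitrary assumptions. It compares the difference quotient of Definition 5.3 with α cos αx and -α sin αx.

```python
import math

def quotient(f, x, h=1e-6):
    """Difference quotient [f(x + h) - f(x)] / h with a small fixed step h."""
    return (f(x + h) - f(x)) / h

alpha = 3.0  # arbitrary illustrative value of the constant

def sin_ax(x):
    return math.sin(alpha * x)

def cos_ax(x):
    return math.cos(alpha * x)

for x in (-1.0, 0.0, 0.7, 2.0):
    print(f"x = {x:+.1f}: "
          f"sin quotient = {quotient(sin_ax, x):+.5f}, "
          f"alpha cos(alpha x) = {alpha * math.cos(alpha * x):+.5f}, "
          f"cos quotient = {quotient(cos_ax, x):+.5f}, "
          f"-alpha sin(alpha x) = {-alpha * math.sin(alpha * x):+.5f}")
```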




Example 5.3 Find the derivatives of the following functions stating any 
point at which they are not differentiable. 

(a) f(x) = 3 for -∞ < x < 1, with f(x) equal to a second constant for 1 < x < ∞.


(b) f(x) = x^5 for all x.


(c) f(x) = x^-3 for x ≠ 0.


(d) f(x) = sin 4x. 
(e) f(x) = cos 7x. 


Solution (a) By virtue of Theorem 5.1, the function f(x) has a zero derivative
for all x except at the point x = 1 where it is not defined.

(b) From Theorem 5.2 we have dy/dx = 5x^4 for all x.

(c) From Theorem 5.2 we have dy/dx = -3x^-4 for x ≠ 0, and the
derivative is not defined at x = 0.

(d) and (e) From Theorem 5.3 we have

\[ \frac{d}{dx}(\sin 4x) = 4\cos 4x, \qquad \frac{d}{dx}(\cos 7x) = -7\sin 7x \quad \text{for all } x. \]

By now it is obvious that Definition 5-3 is a working definition that can 
be used. However, some better method than its direct application is obviously 
needed to compute derivatives of complicated functions. This requirement will 
be systematically pursued in the next section. 


5.2 Rules of differentiation


The complicated functions that occur in mathematical and physical studies 
are invariably the result of forming sums, products, and quotients of simple 
algebraic and trigonometric functions. This suggests that our next task should 
comprise a general study of the operation of differentiation when applied to 
sums, products, and quotients of arbitrary differentiable functions. We will 
present our results in the form of basic theorems which must become 
thoroughly familiar to the reader. 


THEOREM 5.4 (differentiation of a sum) If f(x) and g(x) are real valued
functions of x, differentiable at x0, and k1 and k2 are constants, then the
linear combination k1 f(x) + k2 g(x) is also differentiable at x0. Furthermore,

\[ \left.\frac{d}{dx}\{k_1 f(x) + k_2 g(x)\}\right|_{x=x_0} = k_1 f'(x_0) + k_2 g'(x_0). \]


Proof Here we must apply Definition 5.3 to the linear combination
k1 f(x) + k2 g(x). We obtain

\[ \begin{aligned}
\left.\frac{d}{dx}\{k_1 f(x) + k_2 g(x)\}\right|_{x=x_0}
 &= \lim_{h \to 0} \frac{k_1 f(x_0 + h) + k_2 g(x_0 + h) - [k_1 f(x_0) + k_2 g(x_0)]}{h} \\
 &= k_1 \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} + k_2 \lim_{h \to 0} \frac{g(x_0 + h) - g(x_0)}{h} \\
 &= k_1 f'(x_0) + k_2 g'(x_0).
\end{aligned} \]

If f and g are both differentiable in some common interval ℐ, then the
above argument when applied to each point of ℐ yields the result

\[ \frac{d}{dx}[k_1 f(x) + k_2 g(x)] = k_1 f'(x) + k_2 g'(x), \]

where x is any point of ℐ. The constants k1 and k2 are often absorbed into
the functions f and g, when the result could be expressed 'the derivative of a
sum of functions is equal to the sum of their derivatives'. The task of showing
that this result is true for a linear combination of an arbitrary number of
differentiable functions is left to the reader as an exercise involving proof by
induction.


Example 5.4 Let us use Theorem 5.4 to compute the derivative of
f(x) = sin^2 x.


Solution As it stands we cannot differentiate f(x). However by a well known
trigonometric identity we may transform f(x) to the form

\[ f(x) = \tfrac{1}{2}(1 - \cos 2x), \]

when Theorem 5.4 becomes applicable. Then, using our earlier results
concerning the differentiation of a constant and of cos αx we find that

\[ \begin{aligned}
\frac{d}{dx}(\sin^2 x) &= \frac{d}{dx}\left\{\tfrac{1}{2}(1 - \cos 2x)\right\} \\
 &= \frac{d}{dx}\left(\tfrac{1}{2}\right) - \frac{d}{dx}\left(\tfrac{1}{2}\cos 2x\right) \\
 &= 0 - \tfrac{1}{2}\frac{d}{dx}(\cos 2x) \\
 &= -\tfrac{1}{2}(-2)\sin 2x \\
 &= 2\sin x\cos x.
\end{aligned} \]
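
As a rough numerical check of Example 5.4, the sketch below confirms that sin^2 x and ½(1 - cos 2x) agree, and that the difference quotient of sin^2 x is close to 2 sin x cos x. The sample points and the step h = 1e-6 are arbitrary assumptions made for illustration.

```python
import math

def quotient(f, x, h=1e-6):
    """Difference quotient [f(x + h) - f(x)] / h with a small fixed step h."""
    return (f(x + h) - f(x)) / h

def sin_squared(x):
    return math.sin(x) ** 2

def via_identity(x):
    """The same function written as (1 - cos 2x) / 2."""
    return 0.5 * (1.0 - math.cos(2.0 * x))

for x in (0.0, 0.5, math.pi / 4, 2.0):
    print(f"x = {x:.4f}: "
          f"identity gap = {sin_squared(x) - via_identity(x):.1e}, "
          f"quotient = {quotient(sin_squared, x):+.6f}, "
          f"2 sin x cos x = {2 * math.sin(x) * math.cos(x):+.6f}")
```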


THEOREM 5.5 (differentiation of a product) If f(x) and g(x) are differentiable
real valued functions at x0, then so also is the product function f(x)g(x).
Furthermore,

\[ \left.\frac{d}{dx}\{f(x)g(x)\}\right|_{x=x_0} = f'(x_0)g(x_0) + f(x_0)g'(x_0). \]


Proof Again we consider a difference quotient but this time, for economy
of expression, use the form of limit given in Definition 5.2. We have the
identity

\[ \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f(x)\left\{\frac{g(x) - g(x_0)}{x - x_0}\right\} + g(x_0)\left\{\frac{f(x) - f(x_0)}{x - x_0}\right\}. \tag{I} \]

Now we wish to show that lim f(x) = f(x0) as x → x0. This would be true if f(x) were
continuous but we only know that it is differentiable and as yet do not
know that this implies continuity. We shall prove that it does. As f(x) is
differentiable at x = x0 we must have

\[ \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) + o(h) \quad \text{as } x \to x_0, \]

where h = x - x0. Hence

\[ f(x) - f(x_0) = (x - x_0)[f'(x_0) + o(h)] \quad \text{as } x \to x_0. \]

This implies that if x is taken sufficiently close to x0 then the difference
f(x) - f(x0) can be made arbitrarily small. This is just our definition of
continuity and so we have proved that differentiability of f(x) at x0 implies
its continuity at that point. Thus we are permitted to write

\[ \lim_{x \to x_0} f(x) = f(x_0) \]

and, similarly,

\[ \lim_{x \to x_0} g(x) = g(x_0). \]

Now

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0), \qquad \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0), \]

so, finally, taking the limit of (I) as x → x0, we obtain the result

\[ \left.\frac{d}{dx}\{f(x)g(x)\}\right|_{x=x_0} = f'(x_0)g(x_0) + f(x_0)g'(x_0). \]

Again, if f and g are both differentiable in some common interval ℐ
then, as before, we obtain the more general result

\[ \frac{d}{dx}\{f(x)g(x)\} = f'(x)g(x) + f(x)g'(x) \quad \text{for } x \in \mathcal{I}. \]

As an incidental detail of this proof we have shown that differentiability
at a point implies continuity. This result is worth stating formally.
THEOREM 5.6 If a real valued function f(x) is differentiable at the point
x0, then it is also continuous there. The converse result is not true.


Proof It only remains to prove that the converse result is not true: namely, 
that continuity does not imply differentiability. This has already been seen 
in connection with Fig. 5-3, but let us give a specific example. Our final 
assertion in Theorem 5:6 will be valid even if we can produce only one 
example of a function that is continuous at a point but is not differentiable 
there. Such an example used to prove the falsity of an assertion is a counter- 
example, and in this case we choose the function f(x) = |x|. This is known 
to be continuous at x = 0, but the derivative as defined in Definition 5-3 
is not defined at the origin so the function is not differentiable at that point. 
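
The counter-example f(x) = |x| can be made concrete by computing the difference quotient at the origin separately for positive and negative steps. This is only a sketch; the helper name one_sided_quotients and the step sizes are assumptions made for illustration.

```python
def one_sided_quotients(f, x0, h):
    """Return the difference quotients of f at x0 for the steps -h and +h (h > 0)."""
    left = (f(x0 - h) - f(x0)) / (-h)
    right = (f(x0 + h) - f(x0)) / h
    return left, right

for h in (1e-1, 1e-3, 1e-6):
    left, right = one_sided_quotients(abs, 0.0, h)
    print(f"h = {h:g}: left quotient = {left}, right quotient = {right}")
```

The left quotient is -1 and the right quotient is +1 for every step, so the limit is not independent of the side from which the origin is approached, and Definition 5.1 fails there even though |x| is continuous.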


Example 5.5 Differentiate the function f(x) = sin^2 x and compute f'(¼π).

Solution We express the function as a product and use Theorem 5.5.

\[ \begin{aligned}
\frac{d}{dx}(\sin^2 x) &= \frac{d}{dx}(\sin x \cdot \sin x) \\
 &= \left[\frac{d}{dx}(\sin x)\right]\sin x + \sin x\left[\frac{d}{dx}(\sin x)\right] \\
 &= 2\sin x\left[\frac{d}{dx}(\sin x)\right] \\
 &= 2\sin x\cos x.
\end{aligned} \]

As would be expected, this verifies the result of Example 5.4. Finally,
using this expression we compute

\[ \left.\frac{d}{dx}(\sin^2 x)\right|_{x=\frac{1}{4}\pi} = 2\sin\tfrac{1}{4}\pi\cos\tfrac{1}{4}\pi = 1. \]

Our next theorem is important and concerns the rule for differentiating a 
composite function or, more simply, the rule for the differentiation of a 
function of a function. 


THEOREM 5.7 (differentiation of composite functions) If g(x) is a real
valued differentiable function at x = x0 and f(u) is a real valued differentiable
function at u = g(x0), then f[g(x)] is differentiable at x = x0. Furthermore,

\[ \left.\frac{d}{dx}\{f[g(x)]\}\right|_{x=x_0} = f'[g(x_0)] \cdot g'(x_0). \]


Proof We have the obvious result

\[ \frac{f[g(x)] - f[g(x_0)]}{x - x_0} = \frac{f[g(x)] - f[g(x_0)]}{g(x) - g(x_0)} \cdot \frac{g(x) - g(x_0)}{x - x_0}. \]

Since g(x) is differentiable at x0 it is continuous there, and so g(x) → g(x0)
as x → x0. So, writing g(x) = u, g(x0) = a we have

\[ \frac{f[g(x)] - f[g(x_0)]}{x - x_0} = \frac{f(u) - f(a)}{u - a} \cdot \frac{g(x) - g(x_0)}{x - x_0}. \tag{A} \]

Now for ease of argument we shall assume the behaviour of g(x) to be
strictly monotonic in some neighbourhood of x0, so that g(x) = g(x0) only
when x = x0. In these circumstances the difference quotients on the right-hand
side of (A) are well defined as x → x0 so that we may take limits and obtain

\[ \begin{aligned}
\left.\frac{d}{dx}\{f[g(x)]\}\right|_{x=x_0}
 &= \lim_{x \to x_0} \frac{f[g(x)] - f[g(x_0)]}{x - x_0} \\
 &= \lim_{u \to a}\left[\frac{f(u) - f(a)}{u - a}\right] \cdot \lim_{x \to x_0}\left[\frac{g(x) - g(x_0)}{x - x_0}\right] \\
 &= f'(a) \cdot g'(x_0) \\
 &= f'[g(x_0)] \cdot g'(x_0).
\end{aligned} \tag{B} \]

It is not difficult to show that the theorem is still true when g(x) is not
monotonic in some neighbourhood of x0 and an infinite sequence of points
{xi} exists with limit point x0 at all of which g(xi) = g(x0).

All that is necessary here is to observe that if x → x0 through the successive
values xi of this sequence, then g(xi) - g(x0) = 0 and so

\[ \frac{g(x_i) - g(x_0)}{x_i - x_0} = 0 \quad \text{for every } i. \]

Hence, by Theorem 3.6, it follows that

\[ \left.\frac{d}{dx}\{g(x)\}\right|_{x=x_0} = 0. \]

However, by the same argument,

\[ \frac{f[g(x_i)] - f[g(x_0)]}{x_i - x_0} = 0 \quad \text{for every } i, \]

showing that

\[ \left.\frac{d}{dx}\{f[g(x)]\}\right|_{x=x_0} = 0, \]

and so result (B) is also valid in this case.
If (B) is true at each point of some interval ℐ, then we have the general
result

\[ \frac{d}{dx}\{f[g(x)]\} = f'[g(x)] \cdot g'(x). \]
When the substitution u = g(x) is made, this result can be written:

\[ \frac{df}{dx} = \frac{df}{du}\frac{du}{dx}. \tag{5.6} \]

In this form the theorem is known as the chain rule for differentiation,
and it is this result that is most often found in textbooks. By repeated
application, the chain rule readily extends to enable the differentiation of more
complicated composite functions such as the triple composite function
f{g[h(x)]}, always provided the functions f, g, and h have suitable
differentiability properties. In this case, setting v = h(x) and u = g(v) result (5.6)
takes the form

\[ \frac{df}{dx} = \frac{df}{du}\frac{du}{dv}\frac{dv}{dx}. \tag{5.7} \]

Further extensions of the same kind are obviously possible and are
left to the reader.


Example 5.6 Differentiate the following functions and find the values of
their derivatives at x = 1:


(a) sin (x^2 + 3);
(b) (x^3 + x + 1)^(1/3);
(c) sin √(1 + x^2).


Solution (a) Set u = x^2 + 3 so that

\[ \frac{d}{dx}[\sin(x^2 + 3)] = \frac{d}{dx}(\sin u). \]

From the chain rule:

\[ \frac{d}{dx}[\sin(x^2 + 3)] = \frac{d}{du}(\sin u) \cdot \frac{du}{dx}. \]

Now (d/du)(sin u) = cos u, du/dx = 2x so that

\[ \frac{d}{dx}[\sin(x^2 + 3)] = (\cos u) \cdot 2x = 2x\cos(x^2 + 3). \]

Hence at x = 1,

\[ \left.\frac{d}{dx}[\sin(x^2 + 3)]\right|_{x=1} = 2\cos 4. \]


(b) This time set u = x^3 + x + 1, so that

\[ \frac{d}{dx}[(x^3 + x + 1)^{1/3}] = \frac{d}{dx}(u^{1/3}). \]

From the chain rule:

\[ \frac{d}{dx}[(x^3 + x + 1)^{1/3}] = \frac{d}{du}(u^{1/3}) \cdot \frac{du}{dx}. \]

Hence as (d/du)(u^(1/3)) = ⅓u^(-2/3), du/dx = 3x^2 + 1 we obtain

\[ \frac{d}{dx}[(x^3 + x + 1)^{1/3}] = \tfrac{1}{3}u^{-2/3}(3x^2 + 1) = \tfrac{1}{3}(3x^2 + 1)(x^3 + x + 1)^{-2/3}. \]

Thus when x = 1,

\[ \left.\frac{d}{dx}[(x^3 + x + 1)^{1/3}]\right|_{x=1} = \tfrac{4}{3} \cdot 3^{-2/3}. \]


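(c) We must use the extension of the chain rule given in Eqn (5.7). Set
v = 1 + x^2 when sin √(1 + x^2) = sin √v, and u = √v when sin √(1 + x^2)
= sin u.

Then

\[ \frac{d}{dx}\left[\sin\sqrt{1 + x^2}\right] = \frac{d}{du}(\sin u)\,\frac{du}{dv}\,\frac{dv}{dx} = \cos u\,\frac{du}{dv}\,\frac{dv}{dx}. \]

However,

\[ \frac{dv}{dx} = 2x, \qquad \frac{du}{dv} = \frac{1}{2\sqrt{v}} = \frac{1}{2\sqrt{1 + x^2}}, \]

so that, combining all the results,

\[ \frac{d}{dx}\left[\sin\sqrt{1 + x^2}\right] = \frac{x\cos\sqrt{1 + x^2}}{\sqrt{1 + x^2}}. \]

Whence at x = 1,

\[ \left.\frac{d}{dx}\left[\sin\sqrt{1 + x^2}\right]\right|_{x=1} = \frac{\cos\sqrt{2}}{\sqrt{2}}. \]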
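
The three chain-rule results of Example 5.6 can be checked against difference quotients at x = 1. The sketch below is illustrative only: the step h = 1e-6 and the dictionary layout are assumptions, and agreement to a few decimal places is a sanity check rather than a proof.

```python
import math

def quotient(f, x, h=1e-6):
    """Difference quotient [f(x + h) - f(x)] / h with a small fixed step h."""
    return (f(x + h) - f(x)) / h

# Each entry pairs a function with the derivative obtained from the chain rule.
cases = {
    "(a)": (lambda x: math.sin(x**2 + 3),
            lambda x: 2 * x * math.cos(x**2 + 3)),
    "(b)": (lambda x: (x**3 + x + 1) ** (1 / 3),
            lambda x: (3 * x**2 + 1) * (x**3 + x + 1) ** (-2 / 3) / 3),
    "(c)": (lambda x: math.sin(math.sqrt(1 + x**2)),
            lambda x: x * math.cos(math.sqrt(1 + x**2)) / math.sqrt(1 + x**2)),
}

for label, (f, df) in cases.items():
    print(f"{label}: quotient at x = 1 is {quotient(f, 1.0):+.6f}, "
          f"chain rule gives {df(1.0):+.6f}")
```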


THEOREM 5.8 (differentiation of a quotient) If f(x) and g(x) are real
valued differentiable functions at x0 and g(x0) ≠ 0, then the quotient
f(x)/g(x) is differentiable at x0. Furthermore

\[ \left.\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right]\right|_{x=x_0} = \frac{g(x_0)f'(x_0) - g'(x_0)f(x_0)}{[g(x_0)]^2}. \]


Proof If we consider the quotient f(x)/g(x) to be the product of the two
functions f(x) and 1/g(x), we have by Theorem 5.5

\[ \left.\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right]\right|_{x=x_0} = f'(x_0)\,\frac{1}{g(x_0)} + f(x_0)\left.\frac{d}{dx}\left[\frac{1}{g(x)}\right]\right|_{x=x_0}. \]

Now we must compute (d/dx)(1/g). We set g(x) = u when, from the chain
rule,

\[ \left.\frac{d}{dx}\left[\frac{1}{g(x)}\right]\right|_{x=x_0} = \left.\frac{d}{du}\left[\frac{1}{u}\right]\frac{du}{dx}\right|_{x=x_0} = \left.-\frac{1}{u^2}\frac{du}{dx}\right|_{x=x_0} = -\frac{g'(x_0)}{[g(x_0)]^2}. \]

Hence, combining our results, we obtain the desired result

\[ \left.\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right]\right|_{x=x_0} = \frac{g(x_0)f'(x_0) - g'(x_0)f(x_0)}{[g(x_0)]^2}. \]

As in the other cases the general result follows when the conditions of
the theorem are satisfied throughout some interval ℐ. It has the obvious
form

\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g(x)f'(x) - g'(x)f(x)}{[g(x)]^2}. \]


Example 5.7 Differentiate (3x + 1)/(x^2 - 2) and determine the values of
x for which the derivative is not defined.


Solution Set f(x) = 3x + 1 and g(x) = x^2 - 2. Then f'(x) = 3 and
g'(x) = 2x for all x, whilst g(x) = 0 for x = ±√2. Hence applying
Theorem 5.8 we have

\[ \frac{d}{dx}\left[\frac{3x + 1}{x^2 - 2}\right] = \frac{(x^2 - 2) \cdot 3 - 2x(3x + 1)}{(x^2 - 2)^2} = -\frac{3x^2 + 2x + 6}{(x^2 - 2)^2} \]

provided x ≠ ±√2.
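
The quotient rule of Theorem 5.8 translates directly into a small routine. The sketch below applies it to Example 5.7 and compares the outcome with the simplified closed form; the function names are illustrative assumptions, and the sample points are chosen away from x = ±√2, where g(x) = 0 and the formula does not apply.

```python
def quotient_rule(f, df, g, dg, x):
    """Theorem 5.8: (g f' - g' f) / g^2 evaluated at x; requires g(x) != 0."""
    return (g(x) * df(x) - dg(x) * f(x)) / g(x) ** 2

# The functions of Example 5.7 and their derivatives.
def f(x):
    return 3 * x + 1

def df(x):
    return 3.0

def g(x):
    return x * x - 2

def dg(x):
    return 2 * x

def closed_form(x):
    """The simplified derivative -(3x^2 + 2x + 6) / (x^2 - 2)^2."""
    return -(3 * x * x + 2 * x + 6) / (x * x - 2) ** 2

for x in (-3.0, 0.0, 1.0, 2.5):
    print(f"x = {x:+.1f}: rule = {quotient_rule(f, df, g, dg, x):+.6f}, "
          f"closed form = {closed_form(x):+.6f}")
```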
To complete this section, Table 5-1 summarizes the results of differentiating 


the trigonometric functions. Unfamiliar results may be deduced by directly 
applying Theorem 5:8 to the definitions of the functions concerned. 


Table 5.1 Derivatives of trigonometric functions


d/dx (sin x) = cos x                d/dx (cos x) = -sin x          d/dx (tan x) = sec^2 x

d/dx (cosec x) = -cosec x cot x     d/dx (sec x) = sec x tan x     d/dx (cot x) = -cosec^2 x
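
The entries of Table 5.1 can be spot-checked with difference quotients. This is only a sketch under stated assumptions: the sample point x = 0.8 and the step h = 1e-6 are arbitrary, and sec, cosec, and cot are defined here as reciprocals because Python's math module does not provide them.

```python
import math

def quotient(f, x, h=1e-6):
    """Difference quotient [f(x + h) - f(x)] / h with a small fixed step h."""
    return (f(x + h) - f(x)) / h

def sec(x):
    return 1.0 / math.cos(x)

def cosec(x):
    return 1.0 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

# (name, function, derivative as listed in Table 5.1)
table = [
    ("sin", math.sin, math.cos),
    ("cos", math.cos, lambda x: -math.sin(x)),
    ("tan", math.tan, lambda x: sec(x) ** 2),
    ("cosec", cosec, lambda x: -cosec(x) * cot(x)),
    ("sec", sec, lambda x: sec(x) * math.tan(x)),
    ("cot", cot, lambda x: -cosec(x) ** 2),
]

x = 0.8
for name, f, df in table:
    print(f"{name:>5}: quotient = {quotient(f, x):+.6f}, table entry = {df(x):+.6f}")
```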


5.3 Some important consequences of differentiability 


We preface this section by proving a result that belongs more properly to 
Chapter 3 since it depends for its validity only on the property of continuity. 
Our sole reason for discussing it here is to present it in the context in which it 
will first be used. It is usually known by the name of the intermediate value 
theorem and we shall now show that the idea underlying it is extremely 
simple. 

Consider the situation in which a recording thermometer attached to 
some piece of equipment records its temperature at pre-assigned times. 
Suppose, for instance, that at times t1 and t2 the temperatures recorded were
T1 and T2, respectively. Then although there is no record of the variation of
the temperature T(t) at times t between t1 and t2, it may be safely inferred
that the temperature will pass at least once through each intermediate value
between T1 and T2. It is quite possible for the temperature to assume values
that do not lie between T1 and T2, but no assertion can be made about such
an event. The situation is illustrated in Fig. 5.5 where T* is a typical
temperature intermediate between T1 and T2, and the dotted and solid lines
represent two possible temperature variations with time.

This physical situation is an example of the operation of the intermediate 
value theorem in everyday life, and we are able to make our assertion because 
we know from experience that however rapidly a temperature may change, 
it can never undergo an abrupt jump. In mathematical terms we are saying 
that temperature change must be a continuous process. Expressed like this 
the result seems obvious, but how may we prove it? Our simple proof relies 
on the postulate of Section 3-2, which asserts that every bounded monotonic 
sequence tends to a limit, but first we state the formal result. 


THEOREM 5.9 (intermediate value theorem) Let the real valued function
f(x) be continuous on the closed interval [a, b] and such that f(a) ≠ f(b).
Then if y* is any number intermediate between f(a) and f(b), there exists a
number x* between a and b such that y* = f(x*).

Proof Although a diagram is not essential for this proof, the representative
situation shown in Fig. 5.6 will be of help.

First set x₁ = ½(a + b); then if f(x₁) = y* the result is proved. If not,
consider the intervals (a, x₁), (x₁, b). Then in one of these two intervals, y*
will lie between the functional values occurring at either end of the interval.
Call this interval I₁ and let it be represented by the open interval (a₁, b₁).
Thus in Fig. 5.6, I₁ is the right-hand interval and so in that case
a₁ = ½(a + b), b₁ = b.

Next set x₂ = ½(a₁ + b₁). If f(x₂) = y* the result is proved. If not,
consider the intervals (a₁, x₂), (x₂, b₁). Then in one of these two intervals, y*
will lie between the functional values occurring at either end of the interval.
Call this interval I₂ and let it be represented by the open interval (a₂, b₂).
In Fig. 5.6 the interval I₂ is the left-hand sub-interval of I₁, so that a₂ = a₁,
b₂ = ½(a₁ + b₁).

We either prove the result directly for some xₙ or we define an infinite
sequence of open intervals I₁ ⊃ I₂ ⊃ I₃ ⊃ . . .. Because each interval is
contained by all its predecessors it then follows that the sequence of numbers
a₁, a₂, a₃, . . . is monotonic increasing and bounded above whilst the
sequence of numbers b₁, b₂, b₃, . . . is monotonic decreasing and bounded
below. Hence by the postulate of Section 3.2, the sequences {aᵢ} and {bᵢ} both
tend to a limit. That they both tend to the same limit follows from the fact
that the length of the nth interval Iₙ is (b − a)/2ⁿ, which tends to zero as
n → ∞. Letting the common value of these two limits be denoted by x*
we have, by the continuity of f, lim |f(aₙ) − f(x*)| = 0 and, similarly,
lim |f(bₙ) − f(x*)| = 0. As y* lies between f(aₙ) and f(bₙ) for every n, it
follows that f(x*) = y*, thereby showing the existence of the required
number x*.


Fig. 5.5 Physical illustration of intermediate value theorem.
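
The interval-halving construction used in the proof is itself a practical method for locating x*. The sketch below is a minimal Python illustration of that idea; the function name bisect_for_value, the tolerance and the sample function x³ are assumptions chosen here for illustration and do not come from the text.

```python
def bisect_for_value(f, a, b, y_star, tolerance=1e-12, max_steps=200):
    """Return x in (a, b) with f(x) close to y_star, for continuous f.

    y_star must lie strictly between f(a) and f(b). Each step keeps the
    half interval whose end values still bracket y_star, as in the proof.
    """
    fa, fb = f(a), f(b)
    if not (min(fa, fb) < y_star < max(fa, fb)):
        raise ValueError("y_star must lie strictly between f(a) and f(b)")
    for _ in range(max_steps):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fm == y_star or (b - a) < tolerance:
            return mid
        if (fa - y_star) * (fm - y_star) < 0:
            b, fb = mid, fm      # y_star is bracketed by the left half
        else:
            a, fa = mid, fm      # otherwise it is bracketed by the right half
    return 0.5 * (a + b)


def cube(x):
    return x ** 3


# Find x* in [0, 2] with x*^3 = 5.
x_star = bisect_for_value(cube, 0.0, 2.0, 5.0)
print(x_star, cube(x_star))
```

Since the interval length is halved at each step, about forty steps suffice to reduce an interval of length 2 below the tolerance used here.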
The following is an obvious consequence of the intermediate value
theorem:


Corollary 5.9 Every function that is continuous in a closed interval attains
both its greatest and least values at points of that interval. These values may
occur at the end points of the interval.


Fig. 5.6 Intermediate value theorem.


5.3 (a) Maxima and minima


One of the most familiar and useful applications of differentiation is to the
problem of determining those points in some interval [a, b] at which a
function f(x) assumes its maximum and minimum values. Collectively these
values are known as the extrema of the function f(x) on the interval [a, b] and
they are of various types as this definition indicates.


DEFINITION 5.4 (extrema) Let f(x) be a continuous function defined on
the interval [a, b] so that it attains its greatest and least values at points of
that interval. Then we say that the point x₀ belonging to [a, b] is:


(a) an absolute maximum if f(x₀) > f(x) for all points x in [a, b];
(b) an absolute minimum if f(x₀) < f(x) for all points x in [a, b];
(c) a relative maximum if f(x₀ + h) − f(x₀) < 0 for |h| sufficiently small;
(d) a relative minimum if f(x₀ + h) − f(x₀) > 0 for |h| sufficiently small.


No assumption of differentiability has been made when formulating this
definition so that in Fig. 5.7, point P is an absolute maximum and both
points R and T are relative maxima. Point Q is an absolute minimum and
point S a relative minimum. Although the functional value at U lies
intermediate between those at Q and S, it is not a relative minimum in the sense
of the definition, because it lies at the end of the domain of definition [a, b]
so that only the one-sided behaviour of the function is known there with
respect to h.


Fig. 5.7 Extrema of a function on [a, b].


If now, in addition to continuity, we also require of f(x) that it be differen- 
tiable at the point xo occurring in Definition 5-4, we can easily devise a simple 
test to identify the points where extrema must occur. Consider point P in 
Fig. 5-7 as representative of a maximum at which the function is differentiable. 
The fact that P happens to be an absolute maximum is immaterial for the 
subsequent argument. 

By supposition, if f is differentiable at P, the expression

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

must be independent of the manner of approach of x to x₀. Now for maxima
of types (a) and (c) we have f(x) − f(x₀) < 0, and hence it follows that when
x < x₀, f′(x₀) is the limit of an essentially positive function; whereas when
x > x₀, f′(x₀) is the limit of an essentially negative function. Clearly this is
only possible if f′(x₀) = 0. We have thus proved that if f is differentiable at
x₀, then a necessary condition that f should have a maximum at x₀ is f′(x₀) = 0.

Similar reasoning establishes that the condition f′(x₀) = 0 is also a
necessary condition for the differentiable function f to have a minimum at x₀. To
show that the vanishing of the derivative f′ at a point is not a sufficient
condition for that point to be an extremum, we appeal to a counter-example.
The function f = x³ has a continuous derivative f′ = 3x² which vanishes at
the origin. Nevertheless, f is negative for x < 0 and f is positive for x > 0,
thereby showing that despite the vanishing of the derivative, neither a
maximum nor a minimum of the function can occur at the origin. Later we
shall identify behaviour of this nature as typical of a point of inflection with a
horizontal tangent. Generally speaking, a point of inflection is a demarcation
point on the graph of a differentiable function separating a region of
convexity from a region of concavity. Collectively the points at which the
derivative vanishes, regardless of whether or not they are maxima, minima, or
points of inflection, are called critical points or stationary points of the function.

Combining the previous results, and recalling that the condition that f be 
differentiable at xo precludes behaviour of the type encountered at point T 
in Fig. 5-7, we are able to formulate the following general result. 


THEOREM 5.10 Let f be a real valued differentiable function on some
interval [a, b]. Then the stationary points of f are the numbers ξ for which

\[ f'(\xi) = 0. \]


Once the stationary points of a function have been determined it is 
necessary to examine the functional behaviour in the vicinity of each one in 
order to determine the nature of the point involved. An absolute maximum 
is identified from amongst the relative maxima by direct comparison of the 
functional values at the stationary points in question. A similar process 
identifies an absolute minimum. 


Example 5.8 Without appealing to graphical ideas, find the location and
nature of the extrema of the following two functions and determine if they
are differentiable at these points:


(a) f(x) = ⅓x³ + 2x² + 3x + 1;
(b) f(x) = (2x − 5)x^(2/3).


Solution (a) The stationary points are determined by finding those values
x = ξ for which the derivative f′ vanishes.

Now f′ = x² + 4x + 3 and so the desired stationary points are given by
the roots of the equation

\[ \xi^2 + 4\xi + 3 = 0. \]

These roots are ξ = −1 and ξ = −3, and the functional values at the
respective points are f(−1) = −⅓ and f(−3) = 1. As the derivative f′ is the
sum of continuous functions it is everywhere continuous, so that no cusp-like
behaviour with associated extrema as typified by point T in Fig. 5.7 can arise.
So the two points ξ = −1 and ξ = −3 are the only ones at which stationary
values can occur. An examination of the behaviour of the function near these
points will determine if these stationary values correspond to maxima,
minima, or points of inflection.

A sketch graph would quickly show that in fact ξ = −3 corresponds to
a local maximum and ξ = −1 to a local minimum, but we are specifically
required to establish these results by analytical means. How then can we do
this? The solution lies in a direct application of Definition 5.4, and we
illustrate the argument by considering the stationary point ξ = −1. To find
the behaviour of f close to ξ = −1 we shall set x = −1 + h, where |h| is
small, and substitute in f(x) to obtain

\[ f(-1+h) = \tfrac{1}{3}(-1+h)^3 + 2(-1+h)^2 + 3(-1+h) + 1, \]

whence,

\[ f(-1+h) = -\tfrac{1}{3} + h^2 + \tfrac{1}{3}h^3. \]

Now f(−1) = −⅓ so that we may also write this result in the form

\[ f(-1+h) - f(-1) = h^2\left(\tfrac{h}{3} + 1\right). \]

Clearly for |h| small, the right-hand side is essentially positive, and so we
have succeeded in showing that close to ξ = −1,

\[ f(\xi + h) - f(\xi) > 0, \]

and so by Definition 5.4 (d) the stationary point ξ = −1, at which
f(ξ) = −⅓, is seen to be a local minimum. An exactly similar argument will
establish that the stationary point ξ = −3, at which f(ξ) = 1, is a local
maximum. These are only local extrema because it is possible to find values
of x for which f > 1 and f < −⅓.


Solution (b) This case is more complicated. We have

\[ f'(x) = 2x^{2/3} + \frac{2(2x-5)}{3x^{1/3}}, \]

showing that the stationary points of f are determined by the roots of the
equation

\[ 0 = 2\xi^{2/3} + \frac{2(2\xi-5)}{3\xi^{1/3}}. \]

This has the single root ξ = 1 at which f(1) = −3, showing that the function
has only one stationary point. To determine the nature of this point let us set
x = 1 + h, where |h| is small, and substitute into f(x) to find

\[ f(1+h) = (2h-3)(1+h)^{2/3}. \]

Next we expand the factor (1 + h)^(2/3) by the binomial theorem as far as
terms involving h² to obtain

\[ f(1+h) = (2h-3)\left(1 + \tfrac{2}{3}h - \tfrac{1}{9}h^2 + O(h^3)\right) \]

or,

\[ f(1+h) = -3 + \tfrac{5}{3}h^2 + O(h^3). \]

Using the fact that f(1) = −3 this becomes

\[ f(1+h) - f(1) = \tfrac{5}{3}h^2 + O(h^3), \]

showing that close to ξ = 1, f(ξ + h) − f(ξ) > 0. Hence by Definition
5.4 (d), the stationary point ξ = 1 is seen to correspond to a local minimum.
Again, it is only a local minimum because for large negative x we have
f < −3.

We now observe that f′ is defined for all x other than for x = 0, at which
point f(0) = 0. The behaviour of the function in the vicinity of the origin
needs examination since, as it is not differentiable there, Theorem 5.10 can
provide no information about that point. Set x = h, where |h| is small, and
substitute in f to get

\[ f(h) = (2h-5)h^{2/3}. \]

Now f(0) = 0, so that we may rewrite this as

\[ f(h) - f(0) = (2h-5)h^{2/3}, \]

thereby showing that as the right-hand side is essentially negative for suitably
small h, close to ξ = 0 we have f(ξ + h) − f(ξ) < 0. From Definition 5.4 (c)
we now see that the origin is a local maximum, despite the fact that f is not
differentiable at that point. It is only a local maximum because for large
positive x we have f > f(0). For reference purposes the function is shown in
Fig. 5.8.


The method of classification of stationary points that we have just illus- 
trated is always applicable, though it provides more information than is 
often required. This is so because not only does it discriminate between 
maxima and minima, but it also provides the approximate behaviour of the 
function close to the point in question. We shall return to this problem later 
to provide much simpler criteria by which the nature of stationary points 
may be identified. 
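
The direct use of Definition 5.4 illustrated in Example 5.8 can be imitated numerically by examining the sign of f(ξ + h) − f(ξ) for one small positive and one small negative h. The Python sketch below is an illustration under stated assumptions: the function names and the fixed value h = 0.001 are choices made here, and a single value of h can only suggest, not prove, the behaviour for all sufficiently small |h|.

```python
def classify_by_increment(f, xi, h=1e-3):
    """Classify the point xi from the signs of f(xi +/- h) - f(xi)."""
    left = f(xi - h) - f(xi)
    right = f(xi + h) - f(xi)
    if left < 0 and right < 0:
        return "relative maximum"
    if left > 0 and right > 0:
        return "relative minimum"
    return "neither"


def cubic(x):
    """Function (a) of Example 5.8."""
    return x ** 3 / 3.0 + 2.0 * x ** 2 + 3.0 * x + 1.0


def cusp(x):
    """Function (b) of Example 5.8; x^(2/3) is taken as the real cube root of x^2."""
    return (2.0 * x - 5.0) * (x * x) ** (1.0 / 3.0)


def pure_cube(x):
    """The counter-example f = x^3, stationary at the origin."""
    return x ** 3


print(classify_by_increment(cubic, -1.0))
print(classify_by_increment(cubic, -3.0))
print(classify_by_increment(cusp, 1.0))
print(classify_by_increment(cusp, 0.0))
print(classify_by_increment(pure_cube, 0.0))
```

The point x = 0 of function (b) is included even though f is not differentiable there, because the definition itself makes no use of the derivative.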


Fig. 5.8 y = (2x − 5)x^(2/3).


5.3 (b) Rolle’s theorem

One form of Rolle’s theorem may be stated as follows.


THEOREM 5.11 Let f be a real valued function that is continuous on the
closed interval [a, b] and differentiable at all points of the open interval
(a, b). Then if f(a) = f(b) there is at least one point x = ξ interior to (a, b)
at which f′(ξ) = 0.


Proof We know from Corollary 5.9 that a continuous function f(x) defined
on the closed interval [a, b] must attain its maximum value M and its
minimum value m at points of [a, b]. Then if m = M on [a, b], the function
f(x) = constant, and since the derivative of a constant is zero, the point
x = ξ at which f′(ξ) = 0 may be taken anywhere within the interval.

If f(x) is not a constant function then m ≠ M, and as f(a) = f(b) it
follows that at least one of the numbers m, M must differ from the value
f(a). We shall suppose that M ≠ f(a). Then clearly the value M must be
attained at some point x = ξ interior to (a, b). As f is assumed to be
differentiable in (a, b) it follows that Theorem 5.10 must be applicable showing that
f′(ξ) = 0. A similar argument applies if m ≠ f(a). Geometrically this theorem
simply asserts that the graph of any function satisfying the conditions of the
theorem must have at least one point in the interval [a, b] at which the
tangent to the curve is horizontal.

If f is not differentiable at even one interior point of (a, b) then Rolle’s
theorem cannot be applied. Our counter-example in this instance is the
simple function f(x) = |x| with −1 ≤ x ≤ 1. This function is everywhere
continuous, and is differentiable at all points other than at the origin, but
there is certainly no point x = ξ on [−1, 1] at which f′ = 0. The graph of
this function is shown in Fig. 5.9, with one of a function g(x) not satisfying
the conditions of the theorem but for which the result happens to be true.


Fig. 5.9 Counter-examples for Rolle’s theorem: (a) Rolle’s theorem does not apply,
there being no point ξ for which f′(ξ) = 0; (b) g′(ξ) = 0, but Rolle’s theorem does not apply.


5.3 (c) Mean value theorems for derivatives


Our most important application of Rolle’s theorem will be in the proof of 
the mean value theorem for derivatives. In a first account of the subject it is 
difficult to indicate just how valuable and powerful this deceptively simple 
theorem really is as an analytical tool. However something of its utility will, 
perhaps, be appreciated after studying the remainder of this chapter. First 
let us present an intuitive approach to the theorem. 

Consider Fig. 5.10 which represents a graph of a differentiable function
f(x) on the open interval (a, b). Then as P and S are the points (a, f(a)) and
(b, f(b)), the gradient m of the line PS is

\[ m = \frac{f(b) - f(a)}{b - a}. \]

Now we may identify points Q and R, with respective x-coordinates ξ and η
interior to (a, b), at which the tangent lines l₁ and l₂ to the graph are parallel
to PS, and so must also have the same gradient m. Then because of the
geometrical interpretation of the derivative f′ as the gradient of the tangent
line, at either Q or R we may equate m and f′. If we confine attention to point
Q we have

\[ \frac{f(b) - f(a)}{b - a} = f'(\xi), \]

where a < ξ < b. This is the form in which the mean value theorem for
derivatives, also known as the law of the mean, is usually quoted. In
geometrical terms the theorem asserts that there is always a point (ξ, f(ξ)) on
the graph of the function, with a < ξ < b, at which the tangent to the curve is
parallel to the secant line PS. The fact that the precise value of ξ is not
usually known is, generally speaking, unimportant in the application of this
theorem. This is because it is often used with some limiting argument in
which b → a, so that ξ → a also. A formal statement of the theorem is as
follows.


Fig. 5.10 Illustration of the mean value theorem.


THEOREM 5.12 (mean value theorem for derivatives) If f(x) is a real valued
function that is continuous in [a, b] and differentiable in (a, b), then there
exists a point ξ interior to (a, b) such that

\[ \frac{f(b) - f(a)}{b - a} = f'(\xi). \]

The existence of more than one point ξ in (a, b) at which this result is
true is not precluded. This is so because it is only asserted that such a point
exists, and not that there is necessarily only one such point. Such is the case,
for example, in Fig. 5.10 since as was remarked, f′(ξ) = f′(η) with ξ ≠ η,
though both points ξ and η are interior to (a, b).

Many people would regard the argument above as proof enough of the 
mean value theorem, but for the more critical reader we now offer the 
promised proof based on Rolle’s theorem. 


Proof As with the proofs of many mathematical theorems, our result is
established more easily by a somewhat artificial approach than by a direct
method. Here we shall utilize the intuitively obtained result above to suggest
the form of a special function F(x) to which Rolle’s theorem can be applied,
thereby yielding the desired result.

Specifically, since by implication the result depends on f(x) and x, we
shall try to find the simplest function F(x) that depends on f(x) and x,
that is continuous in [a, b] and is differentiable in (a, b), and is such that
F(a) = F(b). The value of F(a) may be assigned arbitrarily and F(x) will
still satisfy Rolle’s theorem, so to simplify slightly the working we shall
assume that F(a) = F(b) = 0.

We consider the obvious function

\[ F(x) = A + Bx + f(x) \]

which clearly satisfies the continuity and differentiability conditions of
Rolle’s theorem. The constants A and B must be chosen in order that
F(a) = F(b) = 0.

Thus

\[ 0 = A + Ba + f(a) \]

and

\[ 0 = A + Bb + f(b) \]

from which it follows that,

\[ A = \frac{b f(a) - a f(b)}{a - b}, \qquad B = \frac{f(b) - f(a)}{a - b}. \]

Hence F(x) has the form

\[ F(x) = f(x) - f(a) + \left(\frac{x - a}{a - b}\right)\bigl(f(b) - f(a)\bigr). \]

Thus we have succeeded in finding a function F(x) with the desired properties
which satisfies Rolle’s theorem. Differentiating F(x) we obtain

\[ F'(x) = f'(x) - \left(\frac{f(b) - f(a)}{b - a}\right). \]

Now by Rolle’s theorem there exists a point ξ, with a < ξ < b, such that
F′(ξ) = 0 and so we have our desired result

\[ \frac{f(b) - f(a)}{b - a} = f'(\xi). \]

Since we may write ξ = a + θ(b − a), where 0 < θ < 1, this result is
sometimes expressed in the following form attributable to Cauchy,

\[ f(b) - f(a) = (b - a)\, f'[a + \theta(b - a)] \quad \text{with } 0 < \theta < 1. \]
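
Although the theorem asserts only the existence of ξ, for a specific function the point can often be located numerically. The following Python sketch is an illustration only, and rests on assumptions not made in the theorem: that f′ is continuous and that f′(x) − m has opposite signs at x = a and x = b, so that interval halving applied to f′(x) − m will find a root. The names used and the sample function x³ on [0, 2] are chosen here purely for illustration.

```python
import math


def mean_value_point(f, fprime, a, b, tolerance=1e-12):
    """Find one xi in (a, b) with f'(xi) = (f(b) - f(a))/(b - a).

    Uses interval halving on f'(x) - m, so it requires a sign change of
    f'(x) - m between the end points (an assumption of this sketch).
    """
    m = (f(b) - f(a)) / (b - a)          # gradient of the secant line PS
    lo, hi = a, b
    g_lo, g_hi = fprime(lo) - m, fprime(hi) - m
    if g_lo * g_hi >= 0:
        raise ValueError("f' - m does not change sign between a and b")
    while hi - lo > tolerance:
        mid = 0.5 * (lo + hi)
        g_mid = fprime(mid) - m
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


def cube(x):
    return x ** 3


def cube_prime(x):
    return 3.0 * x ** 2


a, b = 0.0, 2.0
xi = mean_value_point(cube, cube_prime, a, b)
theta = (xi - a) / (b - a)               # the theta of Cauchy's form
print(xi, 2.0 / math.sqrt(3.0), theta)
```

For this sample the secant gradient is m = 4, so that 3ξ² = 4 and the computed point may be compared with the exact value ξ = 2/√3.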


Corollary 5.12 If g′(x) = h′(x) at all points of [a, b], then g(x) = h(x)
+ constant in [a, b].


Proof Set f = g − h in Theorem 5.12 applied to the interval [a, x]. Since
f′ = g′ − h′ = 0 throughout the interval, f(x) − f(a) = (x − a)f′(ξ) = 0. Then
g(x) − h(x) = g(a) − h(a) = constant and the result follows.


By applying the same arguments to a suitably constructed function
ψ(x), analogous to F(x), it is a simple matter to prove the following extension
of the mean value theorem due to Cauchy. (See Problem 5.37.)


THEOREM 5.13 (Cauchy extended mean value theorem) If f(x) and g(x)
are real valued functions that are continuous in [a, b] and differentiable in
(a, b) and g′(x) ≠ 0 in (a, b), then there exists a point ξ interior to (a, b)
such that

\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)}. \]

5.3 (d) Indeterminate forms: L’Hospital’s rule

Limits such as lim (sin αx)/x as x → 0, which apparently tend to the form
0/0, have already been encountered and given meaning in special cases. A closely
related problem is that of giving meaning to the limit of a quotient which
apparently tends to ∞/∞. These limit problems are both called indeterminate
forms. One of the most obvious applications of the extended mean value
theorem is to resolve the value of the limit in either of these situations, and
we now prove the simplest statement of a useful result generally known as
L’Hospital’s rule.


THEOREM 5.14 (first form of L’Hospital’s rule) If f(x) and g(x) are real
valued differentiable functions at x = x₀ and,


(a) f(x₀) = g(x₀) = 0,

(b) \( \lim_{x \to x_0} \frac{f'(x)}{g'(x)} = \lambda \), where λ is either a real number or infinity,

then

\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)} = \lambda. \]

Proof Apply the extended mean value theorem to the functions f(x) and
g(x) defined on the interval [x, x₀] and use condition (a) to obtain

\[ \frac{f(x)}{g(x)} = \frac{f'(\xi)}{g'(\xi)}, \]

where x < ξ < x₀.
Now x → x₀ implies that ξ → x₀, so that by condition (b) we have the
desired result

\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{\xi \to x_0} \frac{f'(\xi)}{g'(\xi)} = \lambda. \]

The fact that the variable ξ appears in the second limit in place of the x
stated in the theorem is unimportant. Its function is simply that of a variable
and the symbol used to denote it is immaterial.

In general, when the symbol used to denote a variable is unimportant 
because it only appears in some intermediate calculation, the details of which 
do not concern us, we shall call it a dummy variable. 

A useful extension of L’Hospital’s rule is contained in the following
corollary which allows examination of limits which tend to the form ∞/∞.


Corollary 5.14 If φ(x) and ψ(x) are real valued differentiable functions at
x = x₀ and,


(a) \( \lim_{x \to x_0} \varphi(x) \to \pm\infty, \quad \lim_{x \to x_0} \psi(x) \to \pm\infty \),

(b) \( \lim_{x \to x_0} \frac{\varphi'(x)}{\psi'(x)} = \lambda \), where λ is either a real number or infinity,

then

\[ \lim_{x \to x_0} \frac{\varphi(x)}{\psi(x)} = \lim_{x \to x_0} \frac{\varphi'(x)}{\psi'(x)} = \lambda. \]


Proof Apply the extended mean value theorem to the quotient φ(x)/ψ(x) in
the open interval (x, x₁) with x₀ < x < x₁, and write the result in the form

\[ \frac{\varphi(x)}{\psi(x)} = \left[\frac{1 - \dfrac{\psi(x_1)}{\psi(x)}}{1 - \dfrac{\varphi(x_1)}{\varphi(x)}}\right] \frac{\varphi'(\xi)}{\psi'(\xi)}, \]

where x < ξ < x₁. Then, taking x₁ fixed and arbitrarily close to x₀ so that
ξ → x₀, allow x → x₀. The first factor on the right-hand side then approaches
arbitrarily close to unity thereby giving rise to the stated result. A
modification of this argument shows that the result is also true if x₀ → ∞.


Example 5.9 Determine the value of the following indeterminate forms
using L’Hospital’s rule and Corollary 5.14:


(a) \( \lim_{x \to 0} \dfrac{\sin \alpha x}{x} \);

(b) \( \lim_{x \to 1} \dfrac{x^3 + 3x^2 - 2x - 2}{2x^2 - x - 1} \);

(c) \( \lim_{x \to 0} \dfrac{\sin 3x}{x^3} \);

(d) \( \lim_{x \to \pi/2} \dfrac{\tan 3x}{\tan x} \);

(e) \( \lim_{x \to 0} \dfrac{a/x}{\cot bx} \).

Solution (a) This is of the form lim f/g → 0/0 with f(x) = sin αx and
g(x) = x. As f′(x) = α cos αx and g′(x) = 1 it follows that

\[ \lim_{x \to 0} \frac{\sin \alpha x}{x} = \lim_{x \to 0} \frac{\alpha \cos \alpha x}{1} = \alpha. \]

This confirms the limit that was obtained by a different method in Chapter 3.


(b) This is also of the form lim f/g → 0/0 but this time with f(x) = x³
+ 3x² − 2x − 2 and g(x) = 2x² − x − 1. It follows that f′(x) = 3x²
+ 6x − 2 and g′(x) = 4x − 1 so that

\[ \lim_{x \to 1} \frac{x^3 + 3x^2 - 2x - 2}{2x^2 - x - 1} = \lim_{x \to 1} \frac{3x^2 + 6x - 2}{4x - 1} = \frac{7}{3}. \]


(c) This is again of the form lim f/g → 0/0 with f(x) = sin 3x and
g(x) = x³. Hence f′(x) = 3 cos 3x and g′(x) = 3x² so that

\[ \lim_{x \to 0} \frac{\sin 3x}{x^3} = \lim_{x \to 0} \frac{\cos 3x}{x^2} \to \infty. \]


(d) This is of the form lim f/g → ∞/∞ with f(x) = tan 3x and g(x)
= tan x. Hence f′(x) = 3 sec² 3x and g′(x) = sec² x and by Corollary 5.14,

\[ \lim_{x \to \pi/2} \frac{\tan 3x}{\tan x} = \lim_{x \to \pi/2} \frac{3\sec^2 3x}{\sec^2 x} = 3 \lim_{x \to \pi/2} \frac{\cos^2 x}{\cos^2 3x}. \]

This is again an indeterminate form, but now of the type 0/0. Applying
Theorem 5.14 we have

\[ 3 \lim_{x \to \pi/2} \frac{\cos^2 x}{\cos^2 3x} = 3 \lim_{x \to \pi/2} \frac{2 \sin x \cos x}{6 \sin 3x \cos 3x} = \lim_{x \to \pi/2} \left(\frac{\sin x}{\sin 3x}\right) \lim_{x \to \pi/2} \left(\frac{\cos x}{\cos 3x}\right) \]

and hence

\[ \lim_{x \to \pi/2} \frac{\tan 3x}{\tan x} = -\lim_{x \to \pi/2} \frac{\cos x}{\cos 3x}. \]

This last result is yet again an indeterminate form of the type 0/0 so that a
further application of Theorem 5.14 finally gives

\[ \lim_{x \to \pi/2} \frac{\tan 3x}{\tan x} = -\lim_{x \to \pi/2} \frac{\sin x}{3 \sin 3x} = \frac{1}{3}. \]



(e) This is of the form lim f/g → ∞/∞ but it is easily seen that an
application of Corollary 5.14 will not simplify the limit to be evaluated. Instead, we
rewrite the limit in the form

\[ \lim_{x \to 0} \frac{a/x}{\cot bx} = \lim_{x \to 0} \frac{a \tan bx}{x}, \]

when it is seen that the alternative form is of the type lim f/g → 0/0 with
f(x) = a tan bx and g(x) = x. Now f′(x) = ab sec² bx and g′(x) = 1 so that
by Theorem 5.14,

\[ \lim_{x \to 0} \frac{a/x}{\cot bx} = \lim_{x \to 0} \frac{ab \sec^2 bx}{1} = ab. \]


5.3 (e) Identification of extrema


We return to the topic of extrema and, in particular, to the identification of 
functional behaviour at stationary values by means of the mean value 
theorem. 

Suppose that a real valued function f(x) is differentiable in the interval
(a, b) and has a maximum at an interior point x₀ of (a, b).

Then if h is assumed to be positive and we consider the interval
[x₀ − h, x₀] to the left of x₀, by the mean value theorem

\[ \frac{f(x_0) - f(x_0 - h)}{h} = f'(\xi), \]

where x₀ − h < ξ < x₀.

Now by supposition h > 0 and as x₀ is a maximum, the numerator of
this expression will also be positive showing that f′(ξ) > 0. Hence by allowing
h to tend to zero, it follows that ξ → x₀ and we have shown that to the
immediate left of the maximum we must have f′ > 0.

To the right of the maximum, and in the interval [x₀, x₀ + h], the same
argument shows that

\[ \frac{f(x_0 + h) - f(x_0)}{h} = f'(\eta), \]

where x₀ < η < x₀ + h. This numerator is negative so that to the immediate
right of the maximum we must have f′ < 0.

Similar arguments applied to a minimum and a point of inflection with 
a horizontal tangent yield the following useful theorem, illustrated in Fig. 5-11. 


THEOREM 5.15 (identification of extrema using first derivative) If f(x) is a
real valued differentiable function in the neighbourhood of a point x₀ at
which f′(x₀) = 0 then:


(a) the function has a maximum at x₀ if f′(x) > 0 to the left of x₀ and
f′(x) < 0 to the right of x₀;

(b) the function has a minimum at x₀ if f′(x) < 0 to the left of x₀ and
f′(x) > 0 to the right of x₀;

(c) the function has a point of inflection with zero gradient at x₀ if
f′(x) has the same sign to the left and right of x₀.


Fig. 5.11 Stationary values of y = f(x): (a) local maximum; (b) local minimum;
(c) point of inflection with zero gradient.


In many books these results are regarded as intuitively obvious deductions 
from the geometrical interpretation of a derivative in conjunction with the 
behaviour of the graph of the function. However we have discussed them 
formally here as an illustration of an important consequence of the mean 
value theorem. 


Example 5.10 We again consider the functions of Example 5.8.


Case (a) f(x) = ⅓x³ + 2x² + 3x + 1 with stationary points x = ξ at
ξ = −1 and ξ = −3. As f′(x) = x² + 4x + 3 it follows that to the
immediate left of ξ = −1 we have f′ < 0, whilst to the immediate right f′ > 0
showing that ξ = −1 corresponds to a minimum. A similar argument shows
that ξ = −3 corresponds to a maximum.


Case (b) f(x) = (2x − 5)x^(2/3) with the one stationary point x = ξ at
ξ = 1. As f′(x) = 2x^(2/3) + 2(2x − 5)/3x^(1/3) it follows that f′ < 0 to the
immediate left of ξ = 1 and f′ > 0 to the immediate right. Hence ξ = 1
corresponds to a minimum. As Theorem 5.15 stands, since f is not
differentiable at the origin, the maximum that occurs there must be identified as in
Example 5.8. However a trivial modification of the proof would show that
results (a) and (b) of the theorem are still valid if f is not differentiable at x₀.

5.3 (f) Differentials


In using the notation dy/dx to represent the derivative of the dependent 
variable y with respect to x we have thus far been careful to emphasize that 
dy/dx is simply a number defined by a limit. Although suggestive of incre- 
ments, dy and dx taken separately have as yet no individual meaning. In 
many applications, particularly in differential equations which we encounter 
later, it is convenient to work with actual quantities dy and dx which we will 
call differentials. 

However differentials must obviously be defined in a manner consistent
with the notation dy/dx when it is used to denote the derivative with respect
to x of the function y defined by

\[ y = f(x). \tag{5.8} \]

We achieve this by defining dy, the first-order differential of y, by

\[ dy = f'(x)\,\Delta x, \tag{5.9} \]

where Δx is an increment in x of arbitrary size.

However, if, for the moment, we regard the independent variable x as a
function of x we can write x = g(x) with g′(x) = 1. Then by the above
argument dx, the first-order differential of x, is defined by

\[ dx = 1 \cdot \Delta x, \tag{5.10} \]

showing that we may with meaning write Eqn (5.9) in the form

\[ dy = f'(x)\,dx. \tag{5.11} \]


When needed, the actual increment in y consequent upon an increment
Δx in x will be denoted by Δy. In general the differential dy and the increment
Δy are distinct quantities and the interrelationship between them is indicated
in Fig. 5.12.


Fig. 5.12 Differentials dx and dy.


In more advanced treatments the use of differentials is strictly avoided on
account of logical difficulties encountered with their definition. However
they are so useful that we shall ignore these objections and use them freely
whenever necessary.

It is an immediate consequence of this that if

\[ y = k_1 f(x) + k_2 g(x) \]

then by Theorem 5.4,

\[ dy = k_1 f'(x)\,dx + k_2 g'(x)\,dx \]

or, equivalently, in symbolic notation

\[ d(k_1 f + k_2 g) = k_1\,df + k_2\,dg. \tag{5.12} \]

If we have

\[ y = f(x)g(x) \]

then by Theorem 5.5,

\[ dy = g(x)f'(x)\,dx + f(x)g'(x)\,dx \]

or, equivalently, in symbolic notation

\[ d(fg) = g\,df + f\,dg. \tag{5.13} \]

Finally, if

\[ y = f(x)/g(x) \]

then by Theorem 5.8,

\[ dy = \frac{g(x)f'(x) - f(x)g'(x)}{[g(x)]^2}\,dx \]

or, equivalently, in symbolic notation

\[ d\left(\frac{f}{g}\right) = \frac{g\,df - f\,dg}{g^2}. \tag{5.14} \]


Example 5.11 If f(x) = sin (x² + 4) and g(x) = x³ find the differentials:
(a) d(3f + g);
(b) d(fg);
(c) d(f/g).


Solution
(a) d(3f + g) = d[3 sin (x² + 4) + x³]
= 3 cos (x² + 4) d(x² + 4) + 3x² dx
= 6x cos (x² + 4) dx + 3x² dx.
(b) d(fg) = d[x³ sin (x² + 4)]
= 3x² sin (x² + 4) dx + x³ cos (x² + 4) d(x² + 4)
= 3x² sin (x² + 4) dx + 2x⁴ cos (x² + 4) dx.
(c) \[ d\left(\frac{f}{g}\right) = d\left[\frac{\sin (x^2 + 4)}{x^3}\right] = \frac{x^3 \cos (x^2 + 4)\,d(x^2 + 4) - 3x^2 \sin (x^2 + 4)\,dx}{x^6} = \frac{2x^2 \cos (x^2 + 4)\,dx - 3 \sin (x^2 + 4)\,dx}{x^4}. \]
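
Each differential in Example 5.11 has the form (coefficient) × dx, so the coefficients may be checked against a difference quotient of the function being differentiated. The Python sketch below does this at the single point x = 1.3; the sample point, the step size and the function names are assumptions introduced here for illustration.

```python
import math


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


def f(x):
    return math.sin(x * x + 4.0)


def g(x):
    return x ** 3


def coeff_sum(x):
    """Coefficient of dx in d(3f + g)."""
    return 6.0 * x * math.cos(x * x + 4.0) + 3.0 * x * x


def coeff_product(x):
    """Coefficient of dx in d(fg)."""
    return 3.0 * x * x * math.sin(x * x + 4.0) + 2.0 * x ** 4 * math.cos(x * x + 4.0)


def coeff_quotient(x):
    """Coefficient of dx in d(f/g), valid for x != 0."""
    return (2.0 * x * x * math.cos(x * x + 4.0) - 3.0 * math.sin(x * x + 4.0)) / x ** 4


x = 1.3
print(coeff_sum(x), central_difference(lambda t: 3.0 * f(t) + g(t), x))
print(coeff_product(x), central_difference(lambda t: f(t) * g(t), x))
print(coeff_quotient(x), central_difference(lambda t: f(t) / g(t), x))
```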


For small values of dx, the differential dy is obviously a reasonable
approximation to the actual increment Δy. This simple observation is often
utilized to relate small changes in dependent and independent variables as
the next example shows.


Example 5.12 The pressure p of a polytropic gas is related to the density ρ
by the expression

$$p = A\rho^{\gamma},$$

where A is a constant. Deduce the relationship connecting the differentials
dp and dρ. Given that γ = 3/2 and ρ = 4, and taking dp as an approximation
to the actual pressure change Δp, compute the approximate new pressure if
ρ is increased by 0.1. Compare the approximate and exact results.

Solution In this case p = f(ρ) with $f(\rho) = A\rho^{\gamma}$. Hence $f'(\rho) = \gamma A\rho^{\gamma-1}$ and
thus the desired differential relation is

$$dp = \gamma A\rho^{\gamma-1}\,d\rho.$$

When γ = 3/2 and ρ = 4 it follows from the stated pressure-density
law that the initial pressure $p_0$ is

$$p_0 = 4^{3/2}A = 8A.$$

Using the differential relation to compute the approximate pressure increase
represented by the differential dp we find

$$dp = (3/2)\cdot A\cdot 4^{1/2}\cdot(0.1) = 0.3A.$$

Hence the approximate new pressure is $p_0 + dp = 8.3A$.
The exact new pressure $p_0 + \Delta p$ may be computed from the pressure-density
law by setting ρ = 4.1 to obtain

$$p_0 + \Delta p = (4.1)^{3/2}A = 8.302A.$$

This shows that in this case the differential relation gives a good
approximation to the pressure increase (0.3A against the exact 0.302A).
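
The arithmetic of Example 5.12 can be checked numerically. The following Python sketch is an illustration only: the value A = 1 and the function names `pressure` and `dp` are assumptions made here and do not come from the text.

```python
# Differential approximation for the polytropic law p = A * rho**gamma.
# A = 1.0 is an assumed value, chosen only so that numbers can be printed.

def pressure(rho, A=1.0, gamma=1.5):
    """Exact pressure from the pressure-density law."""
    return A * rho ** gamma

def dp(rho, drho, A=1.0, gamma=1.5):
    """Differential dp = gamma * A * rho**(gamma - 1) * d(rho)."""
    return gamma * A * rho ** (gamma - 1.0) * drho

rho0, drho = 4.0, 0.1
approx_new = pressure(rho0) + dp(rho0, drho)   # p0 + dp
exact_new = pressure(rho0 + drho)              # p0 + Delta p
error = exact_new - approx_new

print(f"approximate: {approx_new:.4f}  exact: {exact_new:.4f}  error: {error:.4f}")
```

With these assumed values the approximate and exact pressures differ only in the third decimal place, which is the sense in which dp is a good approximation to Δp here.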


5.4 Higher derivatives—applications 


We have seen how differentiation applied to a suitable function f(x) yields as
a result another function f'(x), the derivative of f(x) with respect to x. If
the function f'(x) is itself differentiable then a repetition of differentiation
will result in a further function that we shall denote by f''(x) and will call the
second derivative of f(x) with respect to x. We may usefully employ the
dynamical problem that served to introduce the notion of a derivative to
give meaning to the notion of a second derivative, for if f'(x) represents a
velocity, then f''(x) represents an acceleration. If the function f''(x) is
itself differentiable then it is customary to denote the third derivative of f(x)
by f'''(x) after which, if necessary, further derivatives are conventionally
denoted by the use of superscript roman numerals. Hence the sixth derivative
of a suitably differentiable function f(x) would be written $f^{\mathrm{vi}}(x)$.

A better notation than this is needed for general purposes and the two
most often used because of their versatility are

$$\frac{d^n y}{dx^n} \quad\text{or}\quad D^n y.$$

These both represent the nth derivative with respect to x of y = f(x) and
for their determination require the successive application of differentiation
n times. The number n is the order of the derivative and the symbol D
symbolizes the operation of differentiation. Computationally the definition
of the nth derivative of y with respect to x is equivalent to using either of
these two equivalent algorithms

$$\frac{d^n y}{dx^n} = \frac{d}{dx}\left[\frac{d^{n-1}y}{dx^{n-1}}\right] \quad\text{or}\quad D[D^{n-1}y] = D^n y. \tag{5.15}$$

These expressions are, of course, only meaningful when n is an integer
and we shall agree to the convention $D^0 y = y$.

Geometrically, the function $d^n y/dx^n$ bears to the graph of $d^{n-1}y/dx^{n-1}$
the same relationship as does the function dy/dx to the graph of y. Namely
$d^n y/dx^n$ at $x = x_0$ is the gradient of the graph of $d^{n-1}y/dx^{n-1}$ as a function
of x at the same point $x = x_0$.


Example 5.13 Determine $dy/dx$, $d^2y/dx^2$, and $d^3y/dx^3$ given that y = f(x)
with:

(a) $f(x) = \cos mx$;

(b) $f(x) = \tan x$;

(c) $f(x) = 1/(1 + x)$.

If possible make deductions about the nth derivative.


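Solution
(a)
$$\frac{dy}{dx} = f'(x) = \frac{d}{dx}(\cos mx) = -m\sin mx,$$
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}[-m\sin mx] = -m^2\cos mx,$$
$$\frac{d^3y}{dx^3} = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) = \frac{d}{dx}[-m^2\cos mx] = m^3\sin mx.$$

An inductive argument easily shows that the nth derivative is
$$\frac{d^n}{dx^n}(\cos mx) = m^n\cos\left[mx + \frac{n\pi}{2}\right].$$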

In respect of the function y = cos mx, it is of importance to notice that
the simple algebraic equation

$$\frac{d^2y}{dx^2} + m^2 y = 0$$

connects the function and its second derivative. Because this equation
involves derivatives it is a differential equation. Such equations are very
important in both mathematics and the mathematical sciences; the last three
chapters of this book provide an introductory study of them.


(b)
$$\frac{dy}{dx} = f'(x) = \frac{d}{dx}(\tan x) = \sec^2 x,$$
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}(\sec x)^2 = 2\sec^2 x\tan x,$$
$$\frac{d^3y}{dx^3} = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) = \frac{d}{dx}(2\sec^2 x\tan x) = 2\sec^2 x\,(2\tan^2 x + \sec^2 x).$$

There is no simple rule by which $(d^n/dx^n)(\tan x)$ may be computed.
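(c)
$$\frac{dy}{dx} = \frac{d}{dx}\left[\frac{1}{1 + x}\right] = \frac{-1}{(1 + x)^2},$$
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left[\frac{-1}{(1 + x)^2}\right] = \frac{2}{(1 + x)^3},$$
$$\frac{d^3y}{dx^3} = \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) = \frac{d}{dx}\left[\frac{2}{(1 + x)^3}\right] = \frac{-6}{(1 + x)^4}.$$

It follows by induction that
$$\frac{d^n}{dx^n}\left(\frac{1}{1 + x}\right) = \frac{(-1)^n\,n!}{(1 + x)^{n+1}}.$$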


In general, functions are not capable of differentiation an indefinite
number of times, and at some stage they usually become non-differentiable.
A simple example of a function for which repeated differentiation yields
nothing new after a certain stage, though for a different reason from the
above, is $x^n$, with n a positive integer.
The nth derivative of $x^n$ is the constant number n! so that the (n + 1)th and
all subsequent derivatives are identically zero.
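
The statement about $x^n$ can be illustrated by differentiating a polynomial repeatedly. The Python sketch below is an illustration under stated assumptions: a polynomial is represented by a list of coefficients `[a0, a1, a2, ...]`, and the helper names `differentiate` and `nth_derivative` are hypothetical names introduced here.

```python
from math import factorial

def differentiate(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...], where a_k multiplies x**k."""
    # d/dx of a_k x**k is k*a_k x**(k-1); drop the constant term's zero entry.
    return [k * a for k, a in enumerate(coeffs)][1:] or [0]

def nth_derivative(coeffs, n):
    """Apply differentiation n times in succession, as in D[D^(n-1) y] = D^n y."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs

n = 5
x_to_n = [0] * n + [1]                  # coefficients of x**5

print(nth_derivative(x_to_n, n))        # [120], the constant 5!
print(nth_derivative(x_to_n, n + 1))    # [0], identically zero
```

The same repeated application of a single differentiation step is the algorithm expressed by Eqn (5.15).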


5.4 (a) Leibnitz's theorem
This useful theorem is a consequence of Theorem 5.5 and facilitates the
computation of high-order derivatives of the product f(x)g(x) of the two
functions f(x) and g(x), in terms of the derivatives of the individual functions
f(x) and g(x) themselves.

The result is, perhaps, best expressed in terms of the symbolic differentiation
operator D, and for our starting point we now re-express the result of
Theorem 5.5 in terms of the operator D:


$$D(fg) = f\,Dg + g\,Df.$$

Assuming functions f(x) and g(x) are suitably differentiable, a further
application of the operator D together with Theorem 5.5 yields

$$\begin{aligned}
D^2(fg) &= D(f\,Dg + g\,Df) \\
&= Df\cdot Dg + f\,D^2g + Dg\cdot Df + g\,D^2f.
\end{aligned}$$

However

$$Df\cdot Dg = \frac{df}{dx}\,\frac{dg}{dx} = \frac{dg}{dx}\,\frac{df}{dx} = Dg\cdot Df,$$

so that

$$D^2(fg) = f\,D^2g + 2Df\cdot Dg + g\,D^2f. \tag{5.16}$$

A repetition of the same argument shows that

$$D^3(fg) = f\,D^3g + 3Df\cdot D^2g + 3D^2f\cdot Dg + g\,D^3f. \tag{5.17}$$


The coefficients involved in Eqns (5.16) and (5.17) are seen to belong to
the general pattern of binomial coefficients in the expansion of $(a + b)^n$,
namely to the rows of numbers

$$\begin{array}{lc}
(n = 0) & \binom{0}{0} \\[4pt]
(n = 1) & \binom{1}{0}\ \binom{1}{1} \\[4pt]
(n = 2) & \binom{2}{0}\ \binom{2}{1}\ \binom{2}{2} \\[4pt]
(n = 3) & \binom{3}{0}\ \binom{3}{1}\ \binom{3}{2}\ \binom{3}{3}
\end{array}$$

or, equivalently, to the rows

$$\begin{array}{lc}
(n = 0) & 1 \\
(n = 1) & 1\ \ 1 \\
(n = 2) & 1\ \ 2\ \ 1 \\
(n = 3) & 1\ \ 3\ \ 3\ \ 1
\end{array}$$

This suggests that in evaluating $D^n(fg)$, the coefficients arising should
belong to the (n + 1)th row of either of these arrays, which are Pascal
triangles. That this is so can be proved fairly easily, using an inductive
argument similar to that used to prove the binomial theorem. We shall not give
the details, preferring simply to state the theorem.


THEOREM 5.16 (Leibnitz's theorem) If f(x) and g(x) are n times differentiable
real valued functions in the interval (a, b), then

$$D^n(fg) = \sum_{r=0}^{n}\binom{n}{r}\,D^{n-r}f\cdot D^r g.$$

The value and power of this is best shown by an application.


Example 5.14 Use Leibnitz's theorem to evaluate $(d^3/dx^3)(x^6\sin x)$.

Solution Setting n = 3 in the general result gives

$$D^3(fg) = g\,D^3f + 3D^2f\cdot Dg + 3Df\cdot D^2g + f\,D^3g.$$

This is, of course, result (5.17) differently expressed. Now we make the
identifications $f(x) = x^6$ and $g(x) = \sin x$, when it follows that $Df = 6x^5$,
$D^2f = 30x^4$, $D^3f = 120x^3$, and $Dg = \cos x$, $D^2g = -\sin x$, $D^3g = -\cos x$.
Hence substitution into the above result gives

$$D^3(x^6\sin x) = 120x^3\sin x + 90x^4\cos x - 18x^5\sin x - x^6\cos x.$$
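
The sum in Leibnitz's theorem translates directly into a short program. The following Python sketch is illustrative only: the function name `leibniz` and the convention of passing each function as a list of its successive derivatives are assumptions made here. It evaluates $D^3(x^6\sin x)$ at a sample point and compares the value with the closed form found in Example 5.14.

```python
from math import comb, sin, cos

def leibniz(df, dg, n, x):
    """Value of D^n(fg) at x, where df[r] and dg[r] are the r-th derivatives of f and g."""
    return sum(comb(n, r) * df[n - r](x) * dg[r](x) for r in range(n + 1))

# f(x) = x**6 followed by Df, D^2 f, D^3 f
df = [lambda x: x**6, lambda x: 6 * x**5, lambda x: 30 * x**4, lambda x: 120 * x**3]
# g(x) = sin x followed by Dg, D^2 g, D^3 g
dg = [sin, cos, lambda x: -sin(x), lambda x: -cos(x)]

def closed_form(x):
    """Result of Example 5.14 written out term by term."""
    return 120 * x**3 * sin(x) + 90 * x**4 * cos(x) - 18 * x**5 * sin(x) - x**6 * cos(x)

x = 0.7  # arbitrary sample point
print(leibniz(df, dg, 3, x), closed_form(x))
```

The binomial coefficients are supplied by `math.comb`, so the same function serves for any order n for which the derivative lists are long enough.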


5.4 (b) Identification of extrema by second derivatives

An important application of the second derivative of a function f(x) is to the
identification of the nature of its extrema. Let us suppose that f(x) is twice
differentiable and that $f'(x_0) = 0$ and $f''(x_0) = L < 0$.

Then from Definition 5.2 and the notion of a second derivative we must
have that

$$f''(x_0) = \lim_{x\to x_0}\frac{f'(x) - f'(x_0)}{x - x_0} = L < 0.$$

By supposition $f'(x_0) = 0$, so that

$$f''(x_0) = \lim_{x\to x_0}\frac{f'(x)}{x - x_0} = L < 0.$$

This limit must be independent of the manner in which x approaches $x_0$
so that we must consider separately the cases that x lies to the left or to the
right of $x_0$.

If x lies to the left of $x_0$ then $x - x_0 < 0$. Consequently, as the value L
of the limit is negative, the expression defining $f''(x_0)$ implies that to the
immediate left of $x_0$ it must be true that $f'(x) > 0$.

If x lies to the right of $x_0$ then $x - x_0 > 0$. Consequently, as the value L
of the limit is negative, the expression defining $f''(x_0)$ implies that to the
immediate right of $x_0$ it must be true that $f'(x) < 0$.

These results, in conjunction with Theorem 5.15 (a), prove that at a
stationary value $x_0$, for which $f''(x_0) < 0$, the function f(x) attains a maximum
value. An exactly similar argument proves that at a stationary value $x_0$, for
which $f''(x_0) > 0$, the function f(x) attains a minimum value.

To complete the argument, consider the situation in which $f''(x_0) = 0$.
It might be conjectured that this corresponds to a point of inflection; and to
establish the correctness of our intuition let us appeal to the geometrical
interpretation of a derivative as a gradient.

Suppose that $x_0$ corresponds to a point of inflection with zero gradient.
Then as x increases through the value $x_0$, either

(a) f'(x) is initially positive and decreases to a minimum value $f'(x_0) = 0$,
thereafter increasing again (cf. Fig. 5.11 (c));
or,
(b) f'(x) is initially negative and increases to a maximum value $f'(x_0) = 0$,
thereafter decreasing again.

In each case $x_0$ is a stationary value of the first derivative f'(x), so that by
an application of Theorem 5.10 to the function f'(x) we find that $f''(x_0) = 0$
at a point of inflection.

We have thus proved the following theorem.


THEOREM 5.17 (identification of extrema using second derivatives) Let
f(x) be a real valued twice differentiable function in (a, b) with a stationary
point $x_0$ in (a, b), so that $f'(x_0) = 0$. Then, if

(a) $f''(x_0) < 0$ the function f(x) has a maximum at $x_0$,

(b) $f''(x_0) > 0$ the function f(x) has a minimum at $x_0$,

(c) $f''(x_0) = 0$ the function f(x) has a point of inflection at $x_0$ with zero
gradient provided that the sign of f'(x) is the same to the immediate
left and right of $x_0$.


The proof of this theorem shows clearly what was asserted earlier; namely
that a point of inflection on the graph of a function separates a region of
convexity from a region of concavity. There is, of course, no necessity that
this point should have associated with it a zero gradient.

Following this argument to its logical conclusion we see that the proof of
(c) above need only involve the sign of f'(x) to the left and right of $x_0$ when
$f'(x_0) = 0$, for then such arguments are needed to distinguish between an
extremum and a point of inflection. If $f'(x_0) \neq 0$ such problems do not arise
and it is sufficient to look for those values ξ for which $f''(\xi) = 0$. We have
thus proved the following general result.


THEOREM 5.18 (location of points of inflection) If f(x) is a real valued
twice differentiable function then its points of inflection, if any, occur at the
numbers ξ for which $f''(\xi) = 0$ provided that $f'(\xi) \neq 0$. If however this is
not so, and $f'(\xi) = 0$, then ξ corresponds to a point of inflection provided
that the sign of f'(x) is the same to the immediate left and right of ξ.

It is left to the reader as an exercise to prove that when $f'(x_0) = f''(x_0) = 0$,
then provided $f'''(x_0)$ exists, our condition on f'(x) may be replaced by the
requirement $f'''(x_0) \neq 0$. The proof is essentially similar to that given for
Theorem 5.17 though this time the starting point is the definition of $f'''(x_0)$
expressed as a limit. We give this result as a corollary.

Corollary 5.18 If f(x) is a real valued thrice differentiable function and
$f'(\xi) = f''(\xi) = 0$, then f(x) has a point of inflection at x = ξ if $f'''(\xi) \neq 0$.


Example 5.15 Locate and identify the stationary values of the following
functions. Find any points of inflection they may have, together with the
gradient of the tangent line at such points:

(a) $f(x) = x^3 - 12x + 1$ in [−10, 10];
(b) $f(x) = \tan x$ in $[-\tfrac{1}{4}\pi, \tfrac{1}{4}\pi]$;
(c) $f(x) = (x - 1)^3$ in (−∞, ∞).


Solution (a) The stationary values are those numbers ξ for which $f'(\xi) = 0$.
Hence as $f'(x) = 3x^2 - 12$, the stationary values are determined by the
equation

$$3\xi^2 - 12 = 0.$$

This has roots ξ = 2, ξ = −2 which both lie in [−10, 10] and are the desired
stationary values. As $f''(x) = 6x$, it follows that $f''(2) = 12 > 0$ and
$f''(-2) = -12 < 0$. Hence by Theorem 5.17, the point ξ = 2 is a minimum and the
point ξ = −2 is a maximum. Since the function has no other stationary
value there can be no point of inflection at which the tangent line has zero
gradient. However $f''(x) = 6x$ vanishes when x = 0, so that by Theorem
5.18 we see that x = 0 must correspond to a point of inflection. The gradient
at x = 0 is $f'(0) = -12$, which is the gradient of the desired tangent line to
the graph at the point of inflection.

(b) Here we have $f'(x) = \sec^2 x$ and clearly, since $\sec^2 x = 1 + \tan^2 x$,
it follows that $f'(x) \neq 0$ in $[-\tfrac{1}{4}\pi, \tfrac{1}{4}\pi]$. The function f(x) = tan x thus has no
stationary values in $[-\tfrac{1}{4}\pi, \tfrac{1}{4}\pi]$, though it assumes its greatest value at $\tfrac{1}{4}\pi$
and its least value at $-\tfrac{1}{4}\pi$. We have $f''(x) = 2\sec^2 x\tan x$ which vanishes
for x = 0. Hence by Theorem 5.18, the function tan x has a point of inflection
at the origin at which the gradient of the tangent to the graph has the value
$f'(0) = 1$.

(c) We see that $f'(x) = 3(x - 1)^2$ and so the condition $f'(\xi) = 0$ yields
ξ = 1 as the single stationary value. However, $f''(x) = 6(x - 1)$ which shows
that we also have $f''(1) = 0$. Appealing to the last part of Theorem 5.18 we
see that, as $f'(x) = 3(x - 1)^2 > 0$ to both the left and right of x = 1, it
follows that $f(x) = (x - 1)^3$ has a point of inflection at that point. The tangent
line to the graph there has a zero gradient. Alternatively, as $f'''(x) = 6 \neq 0$,
the result also follows from Corollary 5.18.
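
The classification rules of Theorem 5.17 lend themselves to a small routine. The Python sketch below is one possible illustration, not a general-purpose tool: the function name `classify`, the step `h` used to sample the sign of f'(x) on either side, and the label returned when f' changes sign are all assumptions introduced here. The derivatives are supplied by hand, as in Example 5.15.

```python
def classify(d1, d2, xi, h=1e-3):
    """Classify a stationary value xi of f, given callables d1 = f' and d2 = f''."""
    if abs(d1(xi)) > 1e-12:
        raise ValueError("xi is not a stationary value of f")
    if d2(xi) < 0:
        return "maximum"            # Theorem 5.17 (a)
    if d2(xi) > 0:
        return "minimum"            # Theorem 5.17 (b)
    # f''(xi) = 0: compare the sign of f' to the immediate left and right.
    left, right = d1(xi - h), d1(xi + h)
    if left * right > 0:
        return "inflection"         # Theorem 5.17 (c)
    return "extremum"               # f' changes sign, so not an inflection

# Example 5.15 (a): f(x) = x**3 - 12x + 1
d1a = lambda x: 3 * x**2 - 12
d2a = lambda x: 6 * x
# Example 5.15 (c): f(x) = (x - 1)**3
d1c = lambda x: 3 * (x - 1) ** 2
d2c = lambda x: 6 * (x - 1)

print(classify(d1a, d2a, 2.0), classify(d1a, d2a, -2.0), classify(d1c, d2c, 1.0))
```

Sampling f' at a fixed distance h is a crude stand-in for "the immediate left and right"; a smaller h or an analytic sign argument would be needed for functions that oscillate near the stationary value.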


5.5 Partial differentiation

The notion of continuity has already been extended so that it is meaningful
in the context of functions of several independent variables. It is now
appropriate to extend the notion of a derivative in a similar fashion. For simplicity
of argument we shall work with the function f(x, y) of two independent
variables, and in order to visualize its behaviour geometrically we will define
a dependent variable u by the equation

$$u = f(x, y). \tag{5.18}$$

The function may then be represented as a surface in three dimensional
space.

A typical surface generated by a function of the form of Eqn (5.18) is
shown in Fig. 5.13 and, unlike functions of one independent variable, it is
necessary to define more than one first-order derivative. The idea involved is
simple: by holding one of the independent variables in f constant at some
value of interest, the function f then becomes a function of the single
remaining independent variable. We may then differentiate f as though it were a
function only of that one variable. By holding first x and then y constant in
this manner, two different derivatives may be defined which, because of their
manner of computation, will be called partial derivatives to distinguish them
from our earlier use of the term derivative. We shall now express these ideas
formally as a definition and set down the standard notation to be used.


Fig. 5.13 Geometrical interpretation of partial derivatives.

DEFINITION 5.5 (partial derivatives) Let f(x, y) be a function defined near
$(x_0, y_0)$. Suppose that

$$\lim_{x\to x_0}\frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \tag{A}$$

exists and is independent of the direction of approach of x to $x_0$. Then f is
differentiable partially with respect to x at $(x_0, y_0)$. The value of the limit is
denoted by $f_x(x_0, y_0)$ or by $\partial f/\partial x|_{(x_0, y_0)}$ and called the first-order partial
derivative of f with respect to x at $(x_0, y_0)$.

Similarly, suppose that

$$\lim_{y\to y_0}\frac{f(x_0, y) - f(x_0, y_0)}{y - y_0} \tag{B}$$

exists and is independent of the direction of approach of y to $y_0$. Then f is
differentiable partially with respect to y at $(x_0, y_0)$. The limit is denoted by
$f_y(x_0, y_0)$ or by $\partial f/\partial y|_{(x_0, y_0)}$ and called the first-order partial derivative of f
with respect to y at $(x_0, y_0)$.


By analogy with ordinary derivatives, if f(x, y) is differentiable partially
with respect to x and y at all points of some region in the (x, y)-plane and
these derivatives are continuous, then we say f is differentiable in that region.
The operations of partial differentiation with respect to x and y are usually
denoted by the differentiation operators ∂/∂x and ∂/∂y, respectively.

Let us now interpret these definitions in terms of Fig. 5.13. The function
$f(x, y_0)$ occurring in the numerator of limit (A) in Definition 5.5 is represented
in that figure by the intersection of the surface u = f(x, y) with the plane
$y = y_0$, which has been labelled $\Pi_1$. It is the curve $\Gamma_1$. The number $f_x(x_0, y_0)$
defined by limit (A) is the gradient of the tangent line $l_1$ to this curve at point
P. By requiring the limit to be independent of the direction of approach of x
to $x_0$, we have ensured that the tangent lines drawn to the curve at P, whether
from the left or the right, will have the same gradient. In simpler terms this
ensures that the curve $\Gamma_1$ is smooth and has no kink at P.

The function $f(x_0, y)$ occurring in the numerator of limit (B) in the
definition is represented in Fig. 5.13 by the intersection of the surface u = f(x, y)
with the plane $x = x_0$, which has been labelled $\Pi_2$. It is the curve $\Gamma_2$. The
number $f_y(x_0, y_0)$ defined by limit (B) is the gradient of the tangent line $l_2$
to this curve at point P.

Thus by differentiating partially we mean that, during the process of
differentiation, the other independent variable is to be regarded as a constant.
In consequence, all the rules of differentiation developed for functions of a
single variable are also rules of partial differentiation, provided only that the
functions involved are suitably differentiable. On account of this when, for
example, the operator ∂/∂x acts on a function only of y, say g(y), that function
is to be regarded as a constant with respect to this operator and so

$$\frac{\partial}{\partial x}[g(y)] = 0. \quad\text{Similarly}\quad \frac{\partial}{\partial y}[h(x)] = 0.$$


Example 5.16 In each of the following cases compute $f_x$ and $f_y$ as functions of
x and y. Use the result to determine the numerical value of these derivatives
at the stated points:

(a) $f(x, y) = x^3 + 2xy + 2y^2$; (1, 2);
(b) $f(x, y) = x\sin xy + 3$; $(1, \tfrac{1}{2}\pi)$;
(c) $f(x, y) = x/(x^2 + y^2)$; (1, 0).
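Solution
(a)
$$f_x = \frac{\partial}{\partial x}[x^3 + 2xy + 2y^2] = \frac{\partial}{\partial x}[x^3] + 2y\frac{\partial}{\partial x}[x] + 2y^2\frac{\partial}{\partial x}[1],$$
whence
$$\frac{\partial f}{\partial x} = 3x^2 + 2y.$$

At the point (1, 2) we find that $\partial f/\partial x|_{(1,2)} = 7$. Similarly,

$$f_y = \frac{\partial}{\partial y}[x^3 + 2xy + 2y^2] = x^3\frac{\partial}{\partial y}[1] + 2x\frac{\partial}{\partial y}[y] + 2\frac{\partial}{\partial y}[y^2],$$
whence
$$\frac{\partial f}{\partial y} = 2x + 4y.$$

At the point (1, 2) we find that $\partial f/\partial y|_{(1,2)} = 10$.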


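(b)
$$f_x = \frac{\partial}{\partial x}[x\sin xy + 3] = x\frac{\partial}{\partial x}[\sin xy] + \sin xy\,\frac{\partial}{\partial x}[x] + \frac{\partial}{\partial x}[3],$$
whence
$$\frac{\partial f}{\partial x} = xy\cos xy + \sin xy.$$

At the point $(1, \tfrac{1}{2}\pi)$ we find that $\partial f/\partial x|_{(1, \frac{1}{2}\pi)} = 1$. Similarly,

$$f_y = \frac{\partial}{\partial y}[x\sin xy + 3] = x\frac{\partial}{\partial y}[\sin xy] + \frac{\partial}{\partial y}[3],$$
whence
$$\frac{\partial f}{\partial y} = x^2\cos xy,$$
and so
$$\left.\frac{\partial f}{\partial y}\right|_{(1, \frac{1}{2}\pi)} = 0.$$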


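(c)
$$f_x = \frac{\partial}{\partial x}\left[\frac{x}{x^2 + y^2}\right] = \frac{1}{x^2 + y^2}\frac{\partial}{\partial x}[x] + x\frac{\partial}{\partial x}\left[(x^2 + y^2)^{-1}\right] = \frac{1}{x^2 + y^2} - \frac{x}{(x^2 + y^2)^2}\frac{\partial}{\partial x}[x^2 + y^2],$$
whence
$$\frac{\partial f}{\partial x} = \frac{1}{x^2 + y^2} - \frac{2x^2}{(x^2 + y^2)^2} = \frac{y^2 - x^2}{(x^2 + y^2)^2}.$$

At the point (1, 0) we find that $\partial f/\partial x|_{(1,0)} = -1$. Similarly,

$$f_y = \frac{\partial}{\partial y}\left[\frac{x}{x^2 + y^2}\right] = x\frac{\partial}{\partial y}\left[(x^2 + y^2)^{-1}\right] = \frac{-x}{(x^2 + y^2)^2}\frac{\partial}{\partial y}[x^2 + y^2],$$
whence
$$\frac{\partial f}{\partial y} = \frac{-2xy}{(x^2 + y^2)^2},$$
and so
$$\left.\frac{\partial f}{\partial y}\right|_{(1,0)} = 0.$$

The notion of partial differentiation extends to functions of more than two
independent variables in an obvious manner. Suppose that the function
f(x, y, z) is defined near the point $(x_0, y_0, z_0)$ then, provided the limits exist,
we define the three first-order partial derivatives $f_x$, $f_y$, and $f_z$ by the expressions

$$\left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0, z_0)} = \lim_{x\to x_0}\frac{f(x, y_0, z_0) - f(x_0, y_0, z_0)}{x - x_0},$$
$$\left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0, z_0)} = \lim_{y\to y_0}\frac{f(x_0, y, z_0) - f(x_0, y_0, z_0)}{y - y_0},$$
$$\left.\frac{\partial f}{\partial z}\right|_{(x_0, y_0, z_0)} = \lim_{z\to z_0}\frac{f(x_0, y_0, z) - f(x_0, y_0, z_0)}{z - z_0}.$$

Clearly a function of n independent variables will have n different
first-order partial derivatives; one with respect to each of the independent variables.
The actual computation of these partial derivatives is carried out exactly as
before.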
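
The limits that define partial derivatives suggest a simple numerical check of Example 5.16: hold one variable fixed and form a difference quotient in the other. The Python sketch below is an illustration under stated assumptions; the helper names `partial_x` and `partial_y`, the symmetric (central) form of the quotient, and the step h = 1e-6 are choices made here, not part of the text.

```python
from math import sin, pi

def partial_x(f, x, y, h=1e-6):
    """Central difference quotient in x with y held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central difference quotient in y with x held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

# The three functions of Example 5.16 with their stated points.
cases = {
    "a": (lambda x, y: x**3 + 2 * x * y + 2 * y**2, (1.0, 2.0)),
    "b": (lambda x, y: x * sin(x * y) + 3, (1.0, pi / 2)),
    "c": (lambda x, y: x / (x**2 + y**2), (1.0, 0.0)),
}

results = {k: (partial_x(f, *p), partial_y(f, *p)) for k, (f, p) in cases.items()}
for k, (fx, fy) in results.items():
    print(k, round(fx, 6), round(fy, 6))
```

The printed values should agree, to within the accuracy of the difference quotients, with the exact values 7 and 10, 1 and 0, and −1 and 0 obtained analytically.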


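Example 5.17 Find the first-order partial derivatives of
$$f(x, y, z) = x^3y^2 + 3\sin yz + 2.$$

Solution This function has three independent variables so we must obtain
three first-order partial derivatives, namely $f_x$, $f_y$, and $f_z$. First we have

$$\frac{\partial f}{\partial x} = \frac{\partial}{\partial x}[x^3y^2 + 3\sin yz + 2] = y^2\frac{\partial}{\partial x}[x^3] + 3\sin yz\,\frac{\partial}{\partial x}[1] + \frac{\partial}{\partial x}[2],$$
so
$$\frac{\partial f}{\partial x} = 3x^2y^2.$$
Next,
$$\frac{\partial f}{\partial y} = \frac{\partial}{\partial y}[x^3y^2 + 3\sin yz + 2] = x^3\frac{\partial}{\partial y}[y^2] + 3\frac{\partial}{\partial y}[\sin yz] + \frac{\partial}{\partial y}[2],$$
so
$$\frac{\partial f}{\partial y} = 2x^3y + 3z\cos yz.$$
Finally,
$$\frac{\partial f}{\partial z} = \frac{\partial}{\partial z}[x^3y^2 + 3\sin yz + 2] = x^3y^2\frac{\partial}{\partial z}[1] + 3\frac{\partial}{\partial z}[\sin yz] + \frac{\partial}{\partial z}[2],$$
so
$$\frac{\partial f}{\partial z} = 3y\cos yz.$$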


5.6 Total differential

The idea of a differential, that was useful in ordinary differentiation, may
also be developed to advantage in connection with partial differentiation.
We first approach this problem from the geometrical standpoint, and then
indicate how an analytical counterpart of these arguments can be produced.

Let us consider Eqn (5.18) and its geometrical representation in Fig. 5.13.
The conditions for differentiability at P ensure that the surface has a tangent
plane Π at that point (why?), and it is to this plane that we now confine
our attention. An element of this tangent plane defined by the lines $l_1$ and $l_2$
through P is depicted in Fig. 5.14. Obviously points on Π close to P must also
be close to those points on the surface u = f(x, y) that lie vertically below
them. This suggests that for such points, the element of plane Π neighbouring
P represents a good approximation to the element of the curved surface
defining the function u near to P. Thus variations of u close to P may, with
propriety, be approximated by the variations of the corresponding points on Π.

Since we are interested in variations of u about the point P at which
$u_0 = f(x_0, y_0)$, we shall start by translating our coordinate axes without
rotation to the point P. In this position the new x, y, and u coordinate axes
will be denoted by x', y', and u', respectively, as shown in Fig. 5.15.

If, relative to P, the x' and y' coordinates of a point P' are Δx and Δy,
then it is obvious from Fig. 5.15 that the increment du must be

$$du = \Delta x\tan\alpha + \Delta y\tan\beta,$$

where α and β are the angles between the lines $l_1$ and $l_2$ and the x'- and y'-axes,
respectively.

However, by the definition of $f_x$ and $f_y$, we have

$$f_x(x_0, y_0) = \tan\alpha, \qquad f_y(x_0, y_0) = \tan\beta,$$

so that

$$du = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y. \tag{5.19}$$

Fig. 5.14 Tangent plane Π to surface u = f(x, y) at point P.

We now define differentials dx and dy in the independent variables x and y
by setting dx = Δx and dy = Δy. Expression (5.19) then becomes

$$du = f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy, \tag{5.20}$$

which is the relationship by which we define the total differential du of the
function u = f(x, y). This is so called because it takes account of the total
effect, on u, of the changes dx in x and dy in y. The additive effect of these
changes is clearly apparent in Fig. 5.15 and results from using a tangent plane
approximation to the surface near P. As before, when dx and dy are suitably
small, du is a reasonable approximation to the true change Δu given by

$$\Delta u = f(x_0 + dx, y_0 + dy) - f(x_0, y_0). \tag{5.21}$$
An analytic rather than geometric justification of the tangent plane
approximation used to define du in Eqn (5.20) can be based on Theorem 5.12.

Equation (5.21), which is exact, is taken to be the starting point and, by
addition and subtraction of a term $f(x_0, y_0 + \Delta y)$, is written

$$\Delta u = [f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y)] + [f(x_0, y_0 + \Delta y) - f(x_0, y_0)],$$

where the first bracket is a function only of x and the second bracket is a
function only of y.

Then Theorem 5.12 expressed in the Cauchy form may be applied to the
first bracket with respect to x and to the second bracket with respect to y to
yield

$$\Delta u = \Delta x\,f_x(x_0 + \xi\Delta x, y_0 + \Delta y) + \Delta y\,f_y(x_0, y_0 + \eta\Delta y), \tag{5.22}$$

where 0 < ξ < 1 and 0 < η < 1. Partial derivatives have been used here
because, although in the first bracket it is only x that varies whilst in the second
bracket it is only y that varies, both brackets are nevertheless functions of
x and y.

Fig. 5.15 Element of tangent plane.

Result (5.20) then follows by letting Δx and Δy become small. The
continuity of $f_x(x_0 + \xi\Delta x, y_0 + \Delta y)$ allows it to be approximated by
$f_x(x_0, y_0)$ with an error $\varepsilon_1$ and, similarly, the continuity of $f_y(x_0, y_0 + \eta\Delta y)$
allows it to be approximated by $f_y(x_0, y_0)$ with an error $\varepsilon_2$. Then, as Δx,
Δy → 0, so also do $\varepsilon_1$ and $\varepsilon_2$. It is left as an exercise for the reader to supply
the details necessary to make this argument rigorous. If Eqn (5.20) is defined
for all points $(x_0, y_0)$ of some region in the (x, y)-plane, then the suffix zero
may be discarded and Eqn (5.20) can then be regarded as a functional
relationship rather than a result that is true only near one point.

We have thus proved a special case of the following more general result 
whose proof differs in no significant detail. 


THEOREM 5.19 (total differential) Let $f(x_1, x_2, \ldots, x_n)$ be a real valued
function of n real variables and let its first-order partial derivatives exist and
be continuous in some region $\mathcal{R}$. Then the total differential du of the function
$u = f(x_1, x_2, \ldots, x_n)$ in the region $\mathcal{R}$ is given by

$$du = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 + \cdots + \frac{\partial f}{\partial x_n}dx_n.$$

If we consider the surface generated by setting u = constant, then on that
surface du = 0. Theorem 5.19 then takes the form

$$0 = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 + \cdots + \frac{\partial f}{\partial x_n}dx_n, \tag{5.23}$$

showing that the differentials $dx_1, dx_2, \ldots, dx_n$ are no longer independent
since this constraint condition has been imposed on them. This is of course
to be expected, since we have imposed the single condition $f(x_1, x_2, \ldots, x_n)$
= constant on the independent variables $x_1, x_2, \ldots, x_n$ so that we are no
longer free to change them arbitrarily. Indeed, if differentials $dx_1, dx_2, \ldots,
dx_{n-1}$ are chosen arbitrarily, then the remaining differential $dx_n$ is uniquely
determined by Eqn (5.23). If we call the number of independent variables the
number of degrees of freedom associated with the equation $u = f(x_1, x_2, \ldots, x_n)$,
then Eqn (5.23) implies the loss of a single degree of freedom.


Example 5.18 In thermodynamics, the pressure p of an ideal gas, its volume
V, its absolute temperature T and the gas constant R are related by the ideal
gas law pV = RT. Find the expression relating the total differential dp and
the differentials dV and dT.

Solution We have p = RT/V, and so p = f(T, V) with f(T, V) = RT/V.
Hence $\partial f/\partial T = R/V$ and $\partial f/\partial V = -RT/V^2$. Now interpreting Theorem 5.19
in this case we find

$$dp = \left(\frac{\partial f}{\partial T}\right)dT + \left(\frac{\partial f}{\partial V}\right)dV, \tag{*}$$

and so

$$dp = \frac{R}{V}\,dT - \frac{RT}{V^2}\,dV.$$

Notice that the use of the symbol f in the total differential relation (*) to
bring it into accord with the notation of Theorem 5.19 is not strictly necessary
since p = f. We could equally well have written equation (*) as

$$dp = \left(\frac{\partial p}{\partial T}\right)dT + \left(\frac{\partial p}{\partial V}\right)dV,$$

and used the immediately obvious result that

$$\frac{\partial p}{\partial T} = \frac{R}{V}, \qquad \frac{\partial p}{\partial V} = -\frac{RT}{V^2}.$$

Let us now consider the function u = f(x, y) and, as a special case, set
u = 0 so that the equation

$$f(x, y) = 0$$

defines y implicitly in terms of x. How then may we compute the derivative
dy/dx without solving for y in terms of x? The solution to this problem is
provided by Eqn (5.23), which in this case takes the form

$$0 = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy.$$

We saw in connection with the definition of the differentials dy and dx in
Eqn (5.11), that the function (dy/dx), called the derivative of y with respect
to x, is the ratio dy : dx of the differentials. Hence dividing by the differential
dx, assuming that $\partial f/\partial y \neq 0$, and rearranging gives the result

$$\frac{dy}{dx} = -\frac{(\partial f/\partial x)}{(\partial f/\partial y)}.$$

We state this as a corollary to Theorem 5.19.


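Corollary 5.19 (a) If the real variables x and y are related implicitly by the
equation f(x, y) = 0, and the partial derivatives $\partial f/\partial x$ and $\partial f/\partial y$ exist and
are continuous, then

$$\frac{dy}{dx} = -\left(\frac{\partial f}{\partial x}\right)\Big/\left(\frac{\partial f}{\partial y}\right)$$

whenever $\partial f/\partial y \neq 0$. Insistence on this latter condition may be avoided by
writing the result in the alternative form

$$\left(\frac{\partial f}{\partial y}\right)\frac{dy}{dx} + \frac{\partial f}{\partial x} = 0.$$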

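The situation is slightly different if three variables x, y, z are involved and
z, say, is defined implicitly in terms of the independent variables x and y by
the equation

$$f(x, y, z) = 0.$$

In these circumstances it is frequently necessary to compute $\partial z/\partial x$ and
$\partial z/\partial y$ from this implicit relationship. To do so, notice that an obvious
modification of Eqn (5.23) gives

$$\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz = 0,$$

but if z could be obtained explicitly, so that z = z(x, y), it would also follow
from Theorem 5.19 that

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy.$$

Substitution of this result into the above expression gives

$$\left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y}\right)dy = 0,$$

and as x and y are independent variables, dx and dy are arbitrary so that this
expression can only be true if

$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x} = 0 \quad\text{and}\quad \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y} = 0.$$

Hence, we find that provided $\partial f/\partial z \neq 0$,

$$\frac{\partial z}{\partial x} = -\left(\frac{\partial f}{\partial x}\right)\Big/\left(\frac{\partial f}{\partial z}\right), \qquad \frac{\partial z}{\partial y} = -\left(\frac{\partial f}{\partial y}\right)\Big/\left(\frac{\partial f}{\partial z}\right).$$

We state this in the form of a further corollary.

Corollary 5.19 (b) If the real variables x, y, and z are related by the implicit
equation f(x, y, z) = 0 and the first-order derivatives of f exist and are
continuous, then

$$\frac{\partial z}{\partial x} = -\left(\frac{\partial f}{\partial x}\right)\Big/\left(\frac{\partial f}{\partial z}\right), \qquad \frac{\partial z}{\partial y} = -\left(\frac{\partial f}{\partial y}\right)\Big/\left(\frac{\partial f}{\partial z}\right)$$

when $\partial f/\partial z \neq 0$.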


Example 5.19

(a) Find dy/dx given that $x^2y + \sin xy = 0$.

(b) Prove that $(d/dx)(x^r) = rx^{r-1}$ when r is rational.

(c) Find $\partial z/\partial x$ and $\partial z/\partial y$ given that f(x, y, z) = 0, where $f(x, y, z) = x^2 + 2xyz + z^3$.

Solution (a) We must apply Corollary 5.19 (a). As, in this case,

$$f(x, y) = x^2y + \sin xy$$

it follows that

$$\frac{\partial f}{\partial x} = 2xy + y\cos xy$$

and

$$\frac{\partial f}{\partial y} = x^2 + x\cos xy.$$

Hence, by Corollary 5.19 (a),

$$\frac{dy}{dx} = -\frac{(\partial f/\partial x)}{(\partial f/\partial y)} = -\left(\frac{2xy + y\cos xy}{x^2 + x\cos xy}\right)$$

whenever $x^2 + x\cos xy \neq 0$.

(b) We have already shown in Theorem 5.2 that if $y = x^n$, then $dy/dx
= nx^{n-1}$ for n a positive or negative integer. Now we must show this result
is still true if the power involved is rational.

Let $y = x^r$ with r = p/q, where p and q are integers without any common
factor. Then $y = x^{p/q}$ implies, and is implied by, $y^q = x^p$. Let $f(x, y) = y^q - x^p$
so that our equation corresponds to f(x, y) = 0. Then there clearly exist
pairs of real numbers (x, y) for which $y^q = x^p$, and by Theorem 5.2, $\partial f/\partial y
= qy^{q-1} \neq 0$ when y ≠ 0 (that is, when x ≠ 0), and both $\partial f/\partial y$ and $\partial f/\partial x
= -px^{p-1}$ are continuous functions. Hence the conditions of Corollary
5.19 (a) are satisfied so that by the second form of its statement we may write

$$\frac{dy}{dx} = \frac{px^{p-1}}{qy^{q-1}} = \frac{p}{q}\,\frac{x^{p-1}}{x^{p - (p/q)}} = \frac{p}{q}\,x^{(p/q) - 1},$$

when x ≠ 0. In the event that x = 0 we have

$$\left.\frac{d}{dx}(x^{p/q})\right|_{x=0} = \lim_{x\to 0}\frac{x^{p/q} - 0}{x - 0}$$

whenever this limit exists, which it does when p/q > 1, and is then equal to
zero. This establishes our desired result for all x.


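(c) Here,
$$f(x, y, z) = x^2 + 2xyz + z^3$$
and so
$$\frac{\partial f}{\partial x} = 2x + 2yz, \qquad \frac{\partial f}{\partial y} = 2xz, \qquad \frac{\partial f}{\partial z} = 2xy + 3z^2.$$
Thus by Corollary 5.19 (b),
$$\frac{\partial z}{\partial x} = -\left(\frac{2x + 2yz}{2xy + 3z^2}\right), \qquad \frac{\partial z}{\partial y} = \frac{-2xz}{2xy + 3z^2}.$$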


5.7 Envelopes 

A simple and useful application of the total differential is to the problem of 
the determination of envelopes already touched upon in Section 2-5. Before 
proceeding with this application we now formally define an envelope. 


DEFINITION 5.6 Let a family of curves Γ in the (x, y)-plane with parameter
α be defined by the implicit equation

$$f(x, y, \alpha) = 0.$$

Then the envelope of the family Γ, when it exists, is that curve $\mathcal{E}$ which is
tangent to every member of the family.


Figure 5.16 (a) shows some representative members of the family Γ
corresponding to values $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ of the parameter α. Figure 5.16 (b)
shows the same situation on closely neighbouring curves $C_1$ and $C_2$ when the
parametric value for $C_2$ is $\alpha_0 + d\alpha$ which differs only by the differential dα
from the parametric value $\alpha_0$ appropriate to $C_1$. We shall assume that the
curves $C_1$ and $C_2$ intersect at the point P with coordinates $(x_0, y_0)$.

Fig. 5.16 Construction of envelope: (a) envelope of family of curves; (b)
neighbouring members of the family.


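Setting u = f(x, y, α), and regarding x, y, and α as variables, it follows
from Theorem 5.19 that

$$du = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial\alpha}\,d\alpha,$$

and as the family is defined by setting u = 0 (constant) it then follows, as in
Eqn (5.23), that

$$0 = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial\alpha}\,d\alpha.$$

This equation which relates the differentials dx, dy, and dα to the
neighbouring curves $C_1$ and $C_2$ is, in particular, true at P. We signify this by
writing

$$\left(\frac{\partial f}{\partial x}\right)_P dx + \left(\frac{\partial f}{\partial y}\right)_P dy + \left(\frac{\partial f}{\partial\alpha}\right)_P d\alpha = 0, \tag{5.24}$$

where $(\;)_P$ denotes that the associated quantity is to be evaluated at P.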


This equation is just the intersection condition for curves $C_1$ and $C_2$ at P.
As it is required of the envelope ℰ that it be tangent to every member of
the family Γ it follows that as $d\alpha \to 0$, so curve $C_1$ must tend to $C_2$ and the
gradient of the envelope ℰ at P must tend to the gradient of the tangent to
$C_1$ at P. To compute this we use the fact that $\alpha = \alpha_0$ is constant for curve $C_1$
so that the argument that gave rise to Eqn (5.24), when applied to
$f(x, y, \alpha_0) = 0$ gives the tangency condition

$$0 = \left(\frac{\partial f}{\partial x}\right)_P dx + \left(\frac{\partial f}{\partial y}\right)_P dy. \tag{5.25}$$


Now both Eqns (5.24) and (5.25) must be simultaneously true for ℰ and,
consequently, we arrive at the condition

$$\left(\frac{\partial f}{\partial \alpha}\right)_P d\alpha = 0,$$

which, since in general dα is a non-zero differential, can only be true if

$$\left(\frac{\partial f}{\partial \alpha}\right)_P = 0. \tag{5.26}$$

In addition to this result, the fact that P is a point on $C_1$ implies that
$f(x_0, y_0, \alpha_0) = 0$ or, equivalently that

$$\left(f(x, y, \alpha)\right)_P = 0. \tag{5.27}$$


Both conditions (5.26) and (5.27) must be satisfied if the envelope ℰ is to
pass through P and be tangent to $C_1$ at that point, so that dropping the suffix
P, we see that ℰ is the locus of all points for which

$$f(x, y, \alpha) = 0 \quad\text{and}\quad \frac{\partial f}{\partial \alpha}(x, y, \alpha) = 0. \tag{5.28}$$

Elimination of α between these two equations gives a relationship between
x and y which is the desired equation of the envelope ℰ. We have thus proved
the following result.


THEOREM 5.20 (envelopes) When it exists, the equation of the envelope
ℰ of the family of curves

$$f(x, y, \alpha) = 0$$

with parameter α is determined by the elimination of α between the equations

$$f(x, y, \alpha) = 0 \quad\text{and}\quad \frac{\partial f}{\partial \alpha}(x, y, \alpha) = 0.$$


Example 5.20 Determine the envelope ℰ of the family of curves

$$(x - \alpha)^2 + (y + \alpha)^2 = 1,$$

with parameter α.

Fig. 5.17 Envelope of circles.


Solution If we write the equation of this family of curves in the form
f(x, y, α) = 0,
then we must set

$$f(x, y, \alpha) = (x - \alpha)^2 + (y + \alpha)^2 - 1.$$

Hence the equation $\partial f/\partial \alpha = 0$ corresponds to

$$-(x - \alpha) + (y + \alpha) = 0$$

or, equivalently, to

$$\alpha = \tfrac{1}{2}(x - y).$$

To determine the envelope, the conditions of Theorem 5.20 require that
f(x, y, α) = 0 simultaneously with $\partial f/\partial \alpha = 0$. Hence substituting for the
parameter α arrived at above from the condition $\partial f/\partial \alpha = 0$ into the family
of the curves f(x, y, α) = 0 gives

$$\tfrac{1}{4}(x + y)^2 + \tfrac{1}{4}(x + y)^2 = 1$$

or,

$$x + y = \pm\sqrt{2}.$$

The desired envelope ℰ thus comprises the two straight lines

$$y = \sqrt{2} - x \quad\text{and}\quad y = -\sqrt{2} - x.$$


This result could also have been deduced by geometrical arguments as
follows. The original family of curves comprises circles of unit radius, each
with its centre at x = α, y = −α. Consequently, the tangents to these
circles which form their envelope ℰ must be straight lines parallel to the line
of centres y = −x and separated from it by a unit distance (Fig. 5.17).
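
The conclusion of Example 5.20 can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names are illustrative, and the derivative with respect to α is estimated by a central difference rather than found analytically. It confirms that the points where each unit circle touches the lines $x + y = \pm\sqrt{2}$ satisfy both conditions of Theorem 5.20.

```python
import math


def f(x, y, a):
    """Family of Example 5.20 written in the form f(x, y, a) = 0."""
    return (x - a) ** 2 + (y + a) ** 2 - 1.0


def df_da(x, y, a, step=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to a."""
    return (f(x, y, a + step) - f(x, y, a - step)) / (2.0 * step)


def contact_points(a):
    """Points of the circle with centre (a, -a) lying on x + y = +sqrt(2) and x + y = -sqrt(2)."""
    d = 1.0 / math.sqrt(2.0)
    return [(a + d, -a + d), (a - d, -a - d)]


if __name__ == "__main__":
    for a in (-2.0, 0.0, 0.5, 3.0):
        for x, y in contact_points(a):
            # Both f and df/da should vanish (to rounding error) on the envelope.
            print(a, x + y, f(x, y, a), df_da(x, y, a))
```

Each printed row should show x + y close to $\pm\sqrt{2}$ with the last two entries close to zero, in agreement with the geometrical argument above.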

Although in this case it was possible to eliminate α from the equations
arising from Theorem 5.20, this is not generally possible. In the
next example we illustrate how on occasions α may be retained in a form
which allows the equation of the envelope to be expressed in parametric form.


Example 5.21 Find the envelope of the equation

$$(x - \alpha)^2 + y^2 = \frac{\alpha^2}{1 + \alpha^2},$$

where α is a parameter.


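Solution We again write the equation in the form

$$f(x, y, \alpha) = 0,$$

where this time

$$f(x, y, \alpha) = (x - \alpha)^2 + y^2 - \frac{\alpha^2}{1 + \alpha^2}.$$

Then

$$\frac{\partial f}{\partial \alpha} = -2(x - \alpha) - \frac{2\alpha}{1 + \alpha^2} + \frac{2\alpha^3}{(1 + \alpha^2)^2},$$

and hence the condition $\partial f/\partial \alpha = 0$ requires that

$$(x - \alpha) = \frac{\alpha^3}{(1 + \alpha^2)^2} - \frac{\alpha}{1 + \alpha^2}.$$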


Now this is a specially simple situation because y is absent from the equation
$\partial f/\partial \alpha = 0$ which allows us to solve immediately for x in terms of α to get

$$x = \alpha + \frac{\alpha^3}{(1 + \alpha^2)^2} - \frac{\alpha}{1 + \alpha^2}. \tag{A}$$

To find the envelope ℰ, Theorem 5.20 requires that in addition to satisfying
the condition $\partial f/\partial \alpha = 0$ we must also require that f(x, y, α) = 0.
Using the form of (x − α) given above this is easily seen to be equivalent to
requiring that

$$\left[\frac{\alpha^3}{(1 + \alpha^2)^2} - \frac{\alpha}{1 + \alpha^2}\right]^2 + y^2 = \frac{\alpha^2}{1 + \alpha^2}.$$

This may now be solved for y in terms of α to obtain

$$y = \frac{\pm\alpha^2 (3 + 3\alpha^2 + \alpha^4)^{1/2}}{(1 + \alpha^2)^2}. \tag{B}$$

The coordinates (x, y) of points on envelope ℰ are thus determined in
terms of α by equations (A) and (B). Although it is not possible to eliminate
α between these equations to obtain an explicit representation for the envelope
ℰ in terms of x and y, this is of no real importance as we have obtained the
equations of ℰ in parametric form, which is equally satisfactory. Different
values of α will determine different points (x(α), y(α)) on the envelope ℰ.

This example has in fact provided the detailed solution to the problem
first studied in Section 2.5. Notice that for large values of α we have x → α
and y → ±1, as was deduced from purely geometrical considerations when
the problem was first examined.
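
The parametric equations (A) and (B) lend themselves to direct computation. The sketch below is an illustration only, with hypothetical function names and a central-difference estimate of $\partial f/\partial \alpha$. It evaluates points on the envelope, checks that they satisfy both conditions of Theorem 5.20, and shows the limiting behaviour for large α.

```python
import math


def envelope_point(a, sign=1.0):
    """Point (x, y) on the envelope from equations (A) and (B); sign selects the branch of y."""
    x = a + a ** 3 / (1.0 + a * a) ** 2 - a / (1.0 + a * a)
    y = sign * a * a * math.sqrt(3.0 + 3.0 * a * a + a ** 4) / (1.0 + a * a) ** 2
    return x, y


def f(x, y, a):
    """Family of Example 5.21 written in the form f(x, y, a) = 0."""
    return (x - a) ** 2 + y ** 2 - a * a / (1.0 + a * a)


def df_da(x, y, a, step=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to a."""
    return (f(x, y, a + step) - f(x, y, a - step)) / (2.0 * step)


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0, 50.0):
        x, y = envelope_point(a)
        # The last two columns should be close to zero for every a.
        print(a, x, y, f(x, y, a), df_da(x, y, a))
```

For the largest value of α in the loop the computed point has x close to α and y close to 1, as noted in the text.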


5.8 The chain rule and its consequences


If, in Theorem 5.19, the variables $x_1, x_2, \ldots, x_n$ are specified in terms of a
parameter t, say, then the result requires slight modification. Suppose that

$$x_1 = x_1(t), \quad x_2 = x_2(t), \quad \ldots, \quad x_n = x_n(t),$$

which are all differentiable functions of t. Then the variable u becomes a
function of the single real variable t for we may write

$$u = F(t), \tag{5.29}$$

where $F(t) = f(x_1(t), x_2(t), \ldots, x_n(t))$.

Hence by an obvious adaptation of Eqn (5.11) defining differentials we
may write

$$du = F'(t)\,dt, \tag{5.30}$$

where, of course, $F'(t) = du/dt$, the derivative of u with respect to t.
However, by a further application of Eqn (5.11) to each of the variables
$x_1 = x_1(t), x_2 = x_2(t), \ldots, x_n = x_n(t)$ we have the result

$$dx_1 = \left(\frac{dx_1}{dt}\right)dt, \quad dx_2 = \left(\frac{dx_2}{dt}\right)dt, \quad \ldots, \quad dx_n = \left(\frac{dx_n}{dt}\right)dt. \tag{5.31}$$

Substituting these expressions for the differentials $dx_i$ in terms of the
differential dt into the statement of Theorem 5.19 gives

$$du = \left(\frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}\right)dt. \tag{5.32}$$

Finally, a comparison of Eqns (5.30) and (5.32) shows that

$$F'(t) = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}.$$

As $F'(t) = du/dt$, this result facilitates the calculation of du/dt without the
need for formal substitution into $u = f(x_1, x_2, \ldots, x_n)$ of the values
$x_1 = x_1(t), x_2 = x_2(t), \ldots, x_n = x_n(t)$.
We have proved the following useful result.


THEOREM 5.21 (chain rule for partial derivatives) Let $u = f(x_1, x_2, \ldots, x_n)$
be a real valued function of n real variables and let its first-order partial
derivatives exist and be continuous. Further, let each of the variables $x_1, x_2,
\ldots, x_n$ be a differentiable function of the single real variable t so that we
may write

$$x_1 = x_1(t), \quad x_2 = x_2(t), \quad \ldots, \quad x_n = x_n(t).$$

Then the total derivative of u with respect to t is given by

$$\frac{du}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}.$$


Two special cases of this theorem are of sufficient importance to merit
recording as corollaries. The first arises when f is a function of only two
variables between which an explicit relationship exists, and the parameter t is
identified with one of these variables.

As only two variables are involved we shall avoid the use of numerical
suffixes by agreeing to write $x_1 = x$ and $x_2 = y$ where, by supposition,
y = y(x) is some known explicit relation. The statement of Theorem 5.21
then becomes

$$\frac{du}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$

If, now, we identify t with x, then t = x and dx/dt = 1, dy/dt = dy/dx so
that the above result becomes

$$\frac{du}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}.$$

The expression on the right-hand side is the total derivative of u with respect
to x. The first term on the right takes account of the change directly due to x
whilst the second term takes account of the fact that y is itself a function of x.
This result enables du/dx to be obtained without needing to substitute
y = y(x) in the relation u = f(x, y).


Corollary 5.21 (a) If u = f(x, y) is a real valued function of the real
variables x and y with continuous first-order derivatives and y is related to
x by the explicit equation y = y(x), then

$$\frac{du}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}.$$


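More generally, suppose that u = f(x, y) whilst x and y are related
implicitly by the equation

$$g(x, y) = 0.$$

How must we modify our previous argument in order that we may compute
the total derivative du/dx? The result of Corollary 5.21 (a) is still true but
obviously dy/dx now depends on the form of g. To find the form of dy/dx we
can use Corollary 5.19 (a), writing f = g, to see that

$$\frac{dy}{dx} = -\frac{\partial g}{\partial x}\bigg/\frac{\partial g}{\partial y},$$

showing that

$$\frac{du}{dx} = \frac{\partial f}{\partial x} - \left(\frac{\partial f}{\partial y}\right)\left(\frac{\partial g}{\partial x}\right)\bigg/\left(\frac{\partial g}{\partial y}\right),$$

provided $\partial g/\partial y \neq 0$. We state this as our next result.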


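Corollary 5.21 (b) If u = f(x, y) is a real valued function of the real variables
x and y with continuous first-order derivatives, and y is related implicitly to
x by the equation g(x, y) = 0, then

$$\frac{du}{dx} = \frac{\partial f}{\partial x} - \left(\frac{\partial f}{\partial y}\right)\left(\frac{\partial g}{\partial x}\right)\bigg/\left(\frac{\partial g}{\partial y}\right),$$

provided $\partial g/\partial y \neq 0$.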


Example 5.22 Determine the derivative du/dt given that
$u = \sin(x^2 + y^2)$ with x = 3t, $y = 1/(1 + t^2)$.

Solution We must apply Theorem 5.21 making the identifications $x_1 = x$,
$x_2 = y$, and $f(x, y) = \sin(x^2 + y^2)$ with x = 3t and $y = 1/(1 + t^2)$. Hence

$$\frac{\partial f}{\partial x} = 2x\cos(x^2 + y^2), \qquad \frac{\partial f}{\partial y} = 2y\cos(x^2 + y^2),$$

whilst

$$\frac{dx}{dt} = 3, \qquad \frac{dy}{dt} = \frac{-2t}{(1 + t^2)^2}.$$

Substituting in Theorem 5.21,

$$\frac{du}{dt} = 2x\cos(x^2 + y^2)\cdot(3) + 2y\cos(x^2 + y^2)\cdot\left[\frac{-2t}{(1 + t^2)^2}\right]$$

or

$$\frac{du}{dt} = 2\cos(x^2 + y^2)\left[3x - \frac{2yt}{(1 + t^2)^2}\right].$$

Using the known relationships between x, y, and t, the derivative du/dt can
thus be computed for any desired value of t. The details are left to the reader.
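
The details left to the reader in Example 5.22 can be sketched numerically. The short Python example below is an illustration only: the function names are hypothetical, and the comparison derivative is a central-difference estimate. It evaluates du/dt from Theorem 5.21 and compares it with the derivative obtained by substituting x(t) and y(t) directly into u.

```python
import math


def u_of_t(t):
    """u = sin(x^2 + y^2) after direct substitution of x = 3t and y = 1/(1 + t^2)."""
    x = 3.0 * t
    y = 1.0 / (1.0 + t * t)
    return math.sin(x * x + y * y)


def du_dt_chain(t):
    """du/dt assembled term by term from Theorem 5.21."""
    x = 3.0 * t
    y = 1.0 / (1.0 + t * t)
    dx_dt = 3.0
    dy_dt = -2.0 * t / (1.0 + t * t) ** 2
    c = math.cos(x * x + y * y)
    return 2.0 * x * c * dx_dt + 2.0 * y * c * dy_dt


def du_dt_numeric(t, step=1e-6):
    """Central-difference estimate of du/dt for comparison."""
    return (u_of_t(t + step) - u_of_t(t - step)) / (2.0 * step)


if __name__ == "__main__":
    for t in (-0.7, 0.0, 0.3, 1.0):
        print(t, du_dt_chain(t), du_dt_numeric(t))
```

The two columns of derivatives should agree to several decimal places for each t.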


Example 5.23 Determine the total derivative du/dx in each case:
(a) $u = x\cos y + y\cos x$ when $y = 1 + x + x^3$;
(b) $u = x^2 + 2xy - y^2$ when $x^2 + y^2 + \cos xy = 0$.


Solution (a) This requires an application of Corollary 5.21 (a). We set

$$f(x, y) = x\cos y + y\cos x \quad\text{and}\quad y = 1 + x + x^3$$

so that

$$\frac{\partial f}{\partial x} = \cos y - y\sin x, \qquad \frac{\partial f}{\partial y} = -x\sin y + \cos x$$

and

$$\frac{dy}{dx} = 1 + 3x^2.$$

Hence, substituting into Corollary 5.21 (a),

$$\frac{du}{dx} = \cos y - y\sin x + (\cos x - x\sin y)(1 + 3x^2).$$

(b) In this case we use Corollary 5.21 (b), with

$$f(x, y) = x^2 + 2xy - y^2 \quad\text{and}\quad g(x, y) = x^2 + y^2 + \cos xy.$$

Hence

$$\frac{\partial f}{\partial x} = 2(x + y), \qquad \frac{\partial f}{\partial y} = 2(x - y),$$

$$\frac{\partial g}{\partial x} = 2x - y\sin xy, \qquad \frac{\partial g}{\partial y} = 2y - x\sin xy.$$

Finally, applying Corollary 5.21 (b),

$$\frac{du}{dx} = 2(x + y) - \frac{2(x - y)(2x - y\sin xy)}{(2y - x\sin xy)}.$$
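
Part (a) of Example 5.23 can be verified by comparing the result of Corollary 5.21 (a) with the derivative of u after y has been substituted explicitly. The following Python sketch is illustrative only; the helper names are hypothetical and the comparison uses a central-difference estimate.

```python
import math


def y_of_x(x):
    """Explicit relation y = 1 + x + x^3 of Example 5.23 (a)."""
    return 1.0 + x + x ** 3


def u_direct(x):
    """u = x cos y + y cos x with y substituted explicitly."""
    y = y_of_x(x)
    return x * math.cos(y) + y * math.cos(x)


def du_dx_corollary(x):
    """Total derivative du/dx from Corollary 5.21 (a)."""
    y = y_of_x(x)
    f_x = math.cos(y) - y * math.sin(x)
    f_y = -x * math.sin(y) + math.cos(x)
    dy_dx = 1.0 + 3.0 * x * x
    return f_x + f_y * dy_dx


def central_difference(func, x, step=1e-6):
    """Central-difference estimate of the ordinary derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 1.2):
        print(x, du_dx_corollary(x), central_difference(u_direct, x))
```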


5.9 Change of variable


This section discusses a somewhat more complicated situation than that
covered by Theorem 5.21, namely, the implications on partial differentiation
of changing the independent variables in a function $u = f(x_1, x_2, \ldots, x_n)$
that is to be differentiated. This situation commonly occurs as a result of
changing coordinate systems to suit physical problems as the following
example illustrates. Suppose that p = p(x, y, z) is the pressure in a fluid
flowing parallel to the z-axis. Then $\partial p/\partial z$ is the pressure gradient along the
direction of flow and $\partial p/\partial x$, $\partial p/\partial y$ are the transverse pressure gradients in the
plane z = constant.

Fig. 5.18 Cylindrical polar coordinates.


Now, if the flow takes place in a rectangular duct with sides described by
x = constant, y = constant, then the Cartesian coordinates O{x, y, z} are
obviously the natural ones to use. However, if the flow takes place in a
cylindrical pipe, then the z-axis is still convenient as it can be aligned with the
axis of the pipe, but the x-, y-axes are now less useful since the wall of the
pipe becomes the curve $x^2 + y^2 = \text{constant}$. Clearly, a more sensible coordinate
system would be the cylindrical polar coordinates r, θ, z′ in which r and
θ define a point in the plane z′ = constant. Figure 5.18 illustrates this idea.
Plane z = z′ = 0 in both the O{x, y, z} and O{r, θ, z′} systems of axes, and
is denoted by Π. Relative to these two systems the point P has the coordinates
O{x, y, z} and O{r, θ, z′}, respectively, where

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z'. \tag{5.33}$$


How can the pressure gradients described by the partial derivatives
$\partial p/\partial r$, $\partial p/\partial\theta$, and $\partial p/\partial z'$ be determined from Eqn (5.33), and the known
functions $\partial p/\partial x$, $\partial p/\partial y$, and $\partial p/\partial z$? The rest of this section is devoted to solving
this type of problem. Notice that from the definition of partial differentiation,
$\partial p/\partial z$ and $\partial p/\partial z'$ have essentially the same meaning, whereas $\partial p/\partial r$ is the
derivative of p computed along a radius with θ and z′ held constant, whilst
$\partial p/\partial\theta$ is the derivative of p tangential to a circle r = constant drawn on the
plane z′ = constant.

Although the replacement of coordinate variables in this manner involves
replacing a set of n independent variables by a new set also comprising n in
number (n = 3 above), we shall first prove a more general result. Specifically,
consider the implication of the situation in which

$$u = f(x_1, x_2, \ldots, x_n), \tag{5.34}$$

when the independent variables $x_1, x_2, \ldots, x_n$ are themselves differentiable
functions of another set of variables which we denote by $\alpha_1, \alpha_2, \ldots, \alpha_m$.
It is not necessary that m should equal n. Thus we have

$$\begin{aligned}
x_1 &= x_1(\alpha_1, \alpha_2, \ldots, \alpha_m),\\
x_2 &= x_2(\alpha_1, \alpha_2, \ldots, \alpha_m),\\
&\;\;\vdots\\
x_n &= x_n(\alpha_1, \alpha_2, \ldots, \alpha_m).
\end{aligned} \tag{5.35}$$

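If the variables $x_i$ in Eqn (5.34) were to be replaced by the equivalent functions
(5.35) involving the variables $\alpha_i$, then f would become some function
$F(\alpha_1, \alpha_2, \ldots, \alpha_m)$ of $\alpha_1, \alpha_2, \ldots, \alpha_m$ so that by Theorem 5.19 we could write

$$du = \frac{\partial F}{\partial\alpha_1}\,d\alpha_1 + \frac{\partial F}{\partial\alpha_2}\,d\alpha_2 + \cdots + \frac{\partial F}{\partial\alpha_m}\,d\alpha_m. \tag{5.36}$$

Next, observe that by applying this same theorem to the equation for
$x_i$ in Eqn (5.35) we obtain

$$dx_i = \frac{\partial x_i}{\partial\alpha_1}\,d\alpha_1 + \frac{\partial x_i}{\partial\alpha_2}\,d\alpha_2 + \cdots + \frac{\partial x_i}{\partial\alpha_m}\,d\alpha_m \tag{5.37}$$

for i = 1, 2, ..., n.

Substituting these expressions into the statement of Theorem 5.19 then
gives

$$du = \frac{\partial f}{\partial x_1}\left[\frac{\partial x_1}{\partial\alpha_1}\,d\alpha_1 + \cdots + \frac{\partial x_1}{\partial\alpha_m}\,d\alpha_m\right] + \cdots + \frac{\partial f}{\partial x_n}\left[\frac{\partial x_n}{\partial\alpha_1}\,d\alpha_1 + \cdots + \frac{\partial x_n}{\partial\alpha_m}\,d\alpha_m\right]. \tag{5.38}$$

On re-arrangement this becomes

$$du = \left[\frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_1} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_1} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_1}\right]d\alpha_1 + \cdots + \left[\frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_m} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_m} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_m}\right]d\alpha_m. \tag{5.39}$$

Since $f(x_1, x_2, \ldots, x_n) = F(\alpha_1, \alpha_2, \ldots, \alpha_m)$, it follows by a direct
comparison of the ith terms of Eqns (5.36) and (5.39) that

$$\frac{\partial F}{\partial\alpha_i} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_i} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_i} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_i} \tag{5.40}$$

for i = 1, 2, ..., m.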


We state this result in the form of a general theorem. 


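THEOREM 5.22 (change of variable) Let $f(x_1, x_2, \ldots, x_n)$ be a real valued
function of the real variables $x_1, x_2, \ldots, x_n$ whose first-order derivatives exist
and are continuous. Further, let $x_1 = x_1(\alpha_1, \alpha_2, \ldots, \alpha_m)$, $x_2 = x_2(\alpha_1, \alpha_2, \ldots, \alpha_m)$,
..., $x_n = x_n(\alpha_1, \alpha_2, \ldots, \alpha_m)$ be differentiable functions of the real variables
$\alpha_1, \alpha_2, \ldots, \alpha_m$, then

$$\begin{aligned}
\frac{\partial f}{\partial\alpha_1} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_1} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_1} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_1},\\
\frac{\partial f}{\partial\alpha_2} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_2} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_2} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_2},\\
&\;\;\vdots\\
\frac{\partial f}{\partial\alpha_m} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_m} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_m} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial\alpha_m}.
\end{aligned}$$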


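Example 5.24 Express $\partial f/\partial r$, $\partial f/\partial\theta$, and $\partial f/\partial z'$ in terms of $\partial f/\partial x$, $\partial f/\partial y$, and
$\partial f/\partial z$ given that $x = r\cos\theta$, $y = r\sin\theta$, $z = z'$. Find their values given that

$$f(x, y, z) = x^2 + 3xy + y^2 + z^2.$$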


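Solution We must apply Theorem 5.22 with m = n = 3 by making the
identifications $x_1 = x$, $x_2 = y$, $x_3 = z$ and $\alpha_1 = r$, $\alpha_2 = \theta$, $\alpha_3 = z'$. Our
first result is

$$\begin{aligned}
\frac{\partial f}{\partial r} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r},\\
\frac{\partial f}{\partial\theta} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\theta} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\theta},\\
\frac{\partial f}{\partial z'} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial z'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial z'} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial z'}.
\end{aligned}$$

However,

$$\frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial x}{\partial\theta} = -r\sin\theta, \quad \frac{\partial x}{\partial z'} = 0, \quad \frac{\partial y}{\partial r} = \sin\theta,$$

$$\frac{\partial y}{\partial\theta} = r\cos\theta, \quad \frac{\partial y}{\partial z'} = 0, \quad \frac{\partial z}{\partial r} = \frac{\partial z}{\partial\theta} = 0, \quad \frac{\partial z}{\partial z'} = 1.$$

Hence, substituting these values into the above transformation equations
shows that

$$\begin{aligned}
\frac{\partial f}{\partial r} &= \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta,\\
\frac{\partial f}{\partial\theta} &= -\frac{\partial f}{\partial x}\,r\sin\theta + \frac{\partial f}{\partial y}\,r\cos\theta,\\
\frac{\partial f}{\partial z'} &= \frac{\partial f}{\partial z}.
\end{aligned}$$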


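Next, using the fact that $f(x, y, z) = x^2 + 3xy + y^2 + z^2$ we see that

$$\frac{\partial f}{\partial x} = 2x + 3y, \qquad \frac{\partial f}{\partial y} = 3x + 2y, \qquad \frac{\partial f}{\partial z} = 2z,$$

so that

$$\frac{\partial f}{\partial r} = (2x + 3y)\cos\theta + (3x + 2y)\sin\theta.$$

However, as $r^2 = x^2 + y^2$ and $\cos\theta = x/(x^2 + y^2)^{1/2}$, $\sin\theta = y/(x^2 + y^2)^{1/2}$,
this result simplifies to

$$\frac{\partial f}{\partial r} = \frac{2x^2 + 6xy + 2y^2}{(x^2 + y^2)^{1/2}}.$$

A similar calculation shows that

$$\frac{\partial f}{\partial\theta} = 3(x^2 - y^2), \qquad \frac{\partial f}{\partial z'} = 2z.$$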
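
The transformation formulae of Example 5.24 can be checked against derivatives taken directly in the (r, θ, z′) variables. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the direct derivatives are central-difference estimates of f expressed in cylindrical polar coordinates.

```python
import math


def f(x, y, z):
    """Function of Example 5.24 in Cartesian coordinates."""
    return x * x + 3.0 * x * y + y * y + z * z


def f_cylindrical(r, theta, z_prime):
    """The same function after the substitution x = r cos(theta), y = r sin(theta), z = z'."""
    return f(r * math.cos(theta), r * math.sin(theta), z_prime)


def transformed_derivatives(r, theta, z_prime):
    """(df/dr, df/dtheta, df/dz') obtained from Theorem 5.22."""
    x = r * math.cos(theta)
    y = r * math.sin(theta)
    f_x = 2.0 * x + 3.0 * y
    f_y = 3.0 * x + 2.0 * y
    f_z = 2.0 * z_prime
    df_dr = f_x * math.cos(theta) + f_y * math.sin(theta)
    df_dtheta = -f_x * r * math.sin(theta) + f_y * r * math.cos(theta)
    return df_dr, df_dtheta, f_z


def central_difference(func, args, index, step=1e-6):
    """Central-difference partial derivative of func with respect to args[index]."""
    forward = list(args)
    backward = list(args)
    forward[index] += step
    backward[index] -= step
    return (func(*forward) - func(*backward)) / (2.0 * step)


if __name__ == "__main__":
    point = (1.5, 0.4, -0.8)  # arbitrary sample values of (r, theta, z')
    exact = transformed_derivatives(*point)
    for i, name in enumerate(("r", "theta", "z'")):
        print(name, exact[i], central_difference(f_cylindrical, point, i))
```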

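Consider the special case of Theorem 5.22 that results when m = n = 2,
so that its statement becomes

$$\begin{aligned}
\frac{\partial f}{\partial\alpha_1} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_1} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_1},\\
\frac{\partial f}{\partial\alpha_2} &= \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial\alpha_2} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial\alpha_2}.
\end{aligned} \tag{5.41}$$

Now for any differentiable function $f(x_1, x_2)$, once the variable change has
been decided, these equations express the partial derivatives $f_{\alpha_1}$, $f_{\alpha_2}$ in terms
of $f_{x_1}$, $f_{x_2}$ which we suppose to be known. However, if $\partial f/\partial\alpha_1$ and $\partial f/\partial\alpha_2$ are
supposed known, then Eqns (5.41) can be regarded as simultaneous equations
for $f_{x_1}$, $f_{x_2}$. Thus, provided the simultaneous equations can be solved, Eqns
(5.41) may be regarded as describing a one-one transformation, or mapping,
between partial derivatives of f with respect to $(x_1, x_2)$ and $(\alpha_1, \alpha_2)$.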

It is easily seen that provided $J(x_1, x_2) \neq 0$, we have

$$\begin{aligned}
\frac{\partial f}{\partial x_1} &= \left(\frac{\partial f}{\partial\alpha_1}\frac{\partial x_2}{\partial\alpha_2} - \frac{\partial f}{\partial\alpha_2}\frac{\partial x_2}{\partial\alpha_1}\right)\bigg/ J(x_1, x_2),\\
\frac{\partial f}{\partial x_2} &= \left(\frac{\partial x_1}{\partial\alpha_1}\frac{\partial f}{\partial\alpha_2} - \frac{\partial x_1}{\partial\alpha_2}\frac{\partial f}{\partial\alpha_1}\right)\bigg/ J(x_1, x_2),
\end{aligned} \tag{5.42}$$

where

$$J(x_1, x_2) = \frac{\partial x_1}{\partial\alpha_1}\frac{\partial x_2}{\partial\alpha_2} - \frac{\partial x_1}{\partial\alpha_2}\frac{\partial x_2}{\partial\alpha_1} = \begin{vmatrix} \dfrac{\partial x_1}{\partial\alpha_1} & \dfrac{\partial x_2}{\partial\alpha_1} \\[2ex] \dfrac{\partial x_1}{\partial\alpha_2} & \dfrac{\partial x_2}{\partial\alpha_2} \end{vmatrix}. \tag{5.43}$$

The expression $J(x_1, x_2)$ is the Jacobian of the transformation and is
usually written in the form of the functional determinant shown in Eqn
(5.43). If the Jacobian vanishes at any point in the $(\alpha_1, \alpha_2)$-space then at such
points the transformation we are discussing obviously becomes invalid and is
singular. This is because at such points there is no longer any relationship
between partial derivatives in the two coordinate systems. In more advanced
discussions, the Jacobian is shown to play a fundamental role in all matters
relating to changes of variable. Sometimes, to emphasize the variables involved,
in place of $J(x_1, x_2)$ the alternative notation $\partial(x_1, x_2)/\partial(\alpha_1, \alpha_2)$ is
used. This idea is readily extended to more than two independent variables as
would be appropriate in Example 5.24, where three variables are involved.

The non-vanishing of the Jacobian is thus seen to provide an essential
condition for the partial derivatives of any differentiable function f, with
respect to $(x_1, x_2)$ and $(\alpha_1, \alpha_2)$, to be interchangeable by virtue of Eqns (5.42).


Example 5.25 Find the Jacobians of the following transformations and
state where, if at all, they vanish:

(a) $x = r\cos\theta$, $y = r\sin\theta$ (polar coordinates);

(b) x = u + v, y = u − v;

(c) $x = 3u^2 + v^2$, y = u + v.

Solution

(a) $$J(x, y) = \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r(\cos^2\theta + \sin^2\theta) = r.$$

Hence in the case of polar coordinates the Jacobian vanishes at r = 0
(that is, at the origin) which is the only singular point of the transformation.

(b) $$J(x, y) = \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 1 & 1 \\ 1 & -1 \end{vmatrix} = -2.$$

This Jacobian never vanishes so that the transformation is always permissible.

(c) $$J(x, y) = \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 6u & 1 \\ 2v & 1 \end{vmatrix} = 6u - 2v.$$

The Jacobian vanishes when 3u = v, so that the transformation is invalid, or
singular, at all points on that line in the (u, v)-plane.
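
The three Jacobians of Example 5.25 can be estimated numerically from the definition (5.43). The following Python sketch is illustrative only: the helper `jacobian` and the transformation functions are hypothetical names, and each partial derivative is approximated by a central difference.

```python
import math


def jacobian(transform, a1, a2, step=1e-6):
    """Estimate the 2 x 2 Jacobian of (a1, a2) -> (x, y) by central differences."""
    x_p, y_p = transform(a1 + step, a2)
    x_m, y_m = transform(a1 - step, a2)
    dx_da1 = (x_p - x_m) / (2.0 * step)
    dy_da1 = (y_p - y_m) / (2.0 * step)
    x_p, y_p = transform(a1, a2 + step)
    x_m, y_m = transform(a1, a2 - step)
    dx_da2 = (x_p - x_m) / (2.0 * step)
    dy_da2 = (y_p - y_m) / (2.0 * step)
    # Determinant of Eqn (5.43).
    return dx_da1 * dy_da2 - dx_da2 * dy_da1


def polar(r, theta):
    return r * math.cos(theta), r * math.sin(theta)


def linear(u, v):
    return u + v, u - v


def quadratic(u, v):
    return 3.0 * u * u + v * v, u + v


if __name__ == "__main__":
    print(jacobian(polar, 2.0, 0.7))      # expected to be close to r = 2
    print(jacobian(linear, 0.3, -1.2))    # expected to be close to -2
    print(jacobian(quadratic, 1.0, 3.0))  # on the line 3u = v, so close to 0
```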


5.10 Implicit functions


We have already used implicit functions when discussing various consequences
of total differentials, and will now examine these ideas more closely. Consider
the equation f(x, y) = 0. Often the argument is used that from this implicit
function of x and y we can, in principle, solve for y, and as y depends on x,
we are entitled to express y in the explicit form y = φ(x).




Suppose that $f(x, y) = x^2 + y^2 + 1$. Then no real values of x and y
satisfy the implicit equation f(x, y) = 0, so certainly in this case one cannot
solve for y. Thus a necessary condition that we may solve for y near to some
point P with coordinates $(x_0, y_0)$ is that there are real numbers $x_0$, $y_0$ such
that $f(x_0, y_0) = 0$.

Now let u = f(x, y) be the graph of f(x, y), and assume that $f_x$ and $f_y$
exist and are continuous so that the graph will be a smooth surface of the
type shown in Fig. 5.19. Then f(x, y) = 0 is the curve of the section of this
surface by the plane u = 0. In general the curve of the section will be similar
to the smooth curve L shown in the figure and can be described by an equation
of the form y = φ(x). This will obviously be the case provided firstly,
that the surface u = f(x, y) and the plane u = 0 intersect and secondly, that
they are nowhere tangential. The curve L will be smooth, and the function
φ(x) differentiable, because the assumed continuity of the derivatives $f_x$ and
$f_y$ will ensure that the surface u = f(x, y) is itself smooth, and so will generate
a smooth curve of section. This is, of course, the assertion made in Corollary
5.19 (a). Let P be a representative point on L with coordinates $(x_0, y_0)$ in the
u = 0 plane, and line l be drawn tangential to the surface u = f(x, y) at P
in the plane $x = x_0$. Then by Definition 5.5, the angle α between line l and
the plane u = 0 is such that $\tan\alpha = \partial f/\partial y\,|_{(x_0, y_0)}$.


Fig. 5.19 The function y = φ(x) defined by the intersection of u = f(x, y) and the
plane u = 0.

Hence the condition that the surface u = f(x, y) and the plane u = 0
should not be tangential at P is seen to be $f_y(x_0, y_0) \neq 0$. Collecting our
results we now formulate them as the following theorem.


THEOREM 5.23 (implicit function theorem) Let f(x, y) be differentiable and
have continuous first-order partial derivatives near to $(x_0, y_0)$ at which
$f(x_0, y_0) = 0$ and $f_y(x_0, y_0) \neq 0$. Then, near $(x_0, y_0)$, it is possible to solve the
implicit equation f(x, y) = 0 uniquely for y in the explicit form y = φ(x),
where φ(x) is differentiable. That is, near to $(x_0, y_0)$, f(x, φ(x)) = 0.

Notice that this theorem is only of the existence type in that it ensures
that an explicit representation y = φ(x) exists, but gives no information on
how such a representation may be found in any specific case.
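
Although Theorem 5.23 is not constructive, values of φ(x) near $(x_0, y_0)$ can often be approximated numerically. The sketch below uses Newton's iteration in y for fixed x, which is a standard numerical technique and not a method given in the text. The function names, the tolerance, and the test function $f(x, y) = x^2 + y^2 - 1$ near (0, 1) are all assumptions made for illustration.

```python
def solve_for_y(f, f_y, x, y_start, tol=1e-12, max_iter=50):
    """Approximate y = phi(x) satisfying f(x, y) = 0 by Newton's iteration in y.

    y_start should be close to the required root; f_y is the partial
    derivative of f with respect to y, which must not vanish there.
    """
    y = y_start
    for _ in range(max_iter):
        slope = f_y(x, y)
        if slope == 0.0:
            raise ZeroDivisionError("f_y vanished, so the iteration cannot continue")
        correction = f(x, y) / slope
        y -= correction
        if abs(correction) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")


def circle(x, y):
    """Illustrative implicit function: the unit circle x^2 + y^2 - 1 = 0."""
    return x * x + y * y - 1.0


def circle_y(x, y):
    """Partial derivative of circle with respect to y."""
    return 2.0 * y


if __name__ == "__main__":
    # Near (0, 1) the explicit form is y = sqrt(1 - x^2).
    for x in (0.0, 0.3, 0.6):
        print(x, solve_for_y(circle, circle_y, x, 1.0))
```

Starting the iteration near (0, −1) instead would select the other branch of the circle, which illustrates the local nature of the theorem.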

As a corollary to this theorem, consider the relationship between the
derivatives of a function and its inverse. Let F(x, y) = y − f(x), so that
F(x, y) = 0 implies the relationship y = f(x). Suppose that at some point
$(x_0, y_0)$ we have $f'(x_0) \neq 0$ and $y_0 = f(x_0)$. Then, noticing that $\partial F/\partial x
= (\partial/\partial x)[-f(x)] = (d/dx)[-f(x)] = -f'(x)$ and $\partial F/\partial y = 1$, it follows from
Theorem 5.23 that close to $(x_0, y_0)$ we may solve for x as a function of y to
obtain an inverse function x = φ(y). That is, F(φ(y), y) = y − f[φ(y)] = 0.

Furthermore, applying Corollary 5.19 (a) to F(x, y) = 0 and regarding
y as the independent variable and x as the dependent variable, we have

$$\frac{dx}{dy} = -\frac{\partial F}{\partial y}\bigg/\frac{\partial F}{\partial x} = \frac{1}{f'(x)},$$

so that provided $f'(x) \neq 0$, we have

$$\frac{dx}{dy} = 1\bigg/\frac{dy}{dx} \quad\text{or}\quad \varphi'(y) = 1/f'(x),$$

which is the desired result.


Corollary 5.23 Let y = f(x) be a real valued differentiable function of x
close to some point $(x_0, y_0)$ at which $y_0 = f(x_0)$. Let x = φ(y) be the function
inverse to it close to the same point $(x_0, y_0)$ so that $x_0 = \varphi(y_0)$, and let
$f'(x_0) \neq 0$. Then close to $(x_0, y_0)$, we have

$$\varphi'(y) = 1/f'(x)$$

or, equivalently,

$$\frac{dx}{dy} = 1\bigg/\frac{dy}{dx}.$$

This corollary has two important applications which we mention next. The
first application of Corollary 5.23 is to the differentiation of inverse circular
functions. In Section 2.2, we agreed to write

$$y = \text{arc sin}\,x \quad\text{when}\quad x = \sin y \quad\text{and}\quad -\pi/2 \le y \le \pi/2.$$

Now,

$$\frac{dx}{dy} = \cos y \neq 0 \quad\text{for}\quad -\pi/2 < y < \pi/2;$$

that is, for −1 < x < 1 and so, by Corollary 5.23,

$$\frac{dy}{dx} = 1\bigg/\left(\frac{dx}{dy}\right) = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}.$$

The positive square root has been taken here because the principal branch of
the function y = arc sin x is a monotonic increasing function of x in its
domain of definition −1 < x < 1. By this same argument, the negative
square root is taken when differentiating the principal branch of the function
y = arc cos x which is a monotonic decreasing function of x in its domain
of definition −1 < x < 1. Thus

$$\frac{d}{dx}(\text{arc cos}\,x) = \frac{-1}{\sqrt{1 - x^2}} \quad\text{for}\quad -1 < x < 1.$$

Similar arguments establish Table 5.2. In the entries for the derivatives
of arc cosec and arc sec, the term |x| has been introduced to take account of
the two separate cases that need consideration when deriving these results;
namely, when x > a and when x < −a. These same ideas will be encountered
again in the next chapter in connection with Table 6.3, when they will be
discussed in more detail.


Table 5.2 Derivatives of inverse circular functions

$\dfrac{d}{dx}(\text{arc sin}\,x/a) = \dfrac{1}{\sqrt{a^2 - x^2}}$ for −a < x < a

$\dfrac{d}{dx}(\text{arc cos}\,x/a) = \dfrac{-1}{\sqrt{a^2 - x^2}}$ for −a < x < a

$\dfrac{d}{dx}(\text{arc tan}\,x/a) = \dfrac{a}{a^2 + x^2}$ for all x

$\dfrac{d}{dx}(\text{arc cosec}\,x/a) = \dfrac{-a}{|x|\sqrt{x^2 - a^2}}$ for |x| > a

$\dfrac{d}{dx}(\text{arc sec}\,x/a) = \dfrac{a}{|x|\sqrt{x^2 - a^2}}$ for |x| > a

$\dfrac{d}{dx}(\text{arc cot}\,x/a) = \dfrac{-a}{a^2 + x^2}$ for all x
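
Some entries of Table 5.2 can be spot-checked numerically. The Python sketch below is illustrative only. It assumes a positive constant a (here the arbitrary value 2), takes the principal branch of arc sec x/a to be arc cos (a/x), which is an assumption about the convention in use, and compares each tabulated derivative with a central-difference estimate.

```python
import math

A = 2.0  # illustrative positive value of the constant a


def central_difference(func, x, step=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


# Each entry pairs a function with its derivative as given in Table 5.2.
TABLE = {
    "arc sin": (lambda x: math.asin(x / A),
                lambda x: 1.0 / math.sqrt(A * A - x * x)),
    "arc cos": (lambda x: math.acos(x / A),
                lambda x: -1.0 / math.sqrt(A * A - x * x)),
    "arc tan": (lambda x: math.atan(x / A),
                lambda x: A / (A * A + x * x)),
    # Assumed principal branch: arc sec(x/a) = arc cos(a/x).
    "arc sec": (lambda x: math.acos(A / x),
                lambda x: A / (abs(x) * math.sqrt(x * x - A * A))),
}


def max_error(name, sample_points):
    """Largest discrepancy between the tabulated and the estimated derivative."""
    func, derivative = TABLE[name]
    return max(abs(central_difference(func, x) - derivative(x)) for x in sample_points)


if __name__ == "__main__":
    print("arc sin", max_error("arc sin", (-1.5, 0.0, 1.0)))
    print("arc cos", max_error("arc cos", (-1.5, 0.0, 1.0)))
    print("arc tan", max_error("arc tan", (-4.0, 0.0, 3.0)))
    print("arc sec", max_error("arc sec", (-3.0, 2.5, 5.0)))
```

The sample points for arc sec lie on both sides of the excluded interval, so the check exercises the two cases x > a and x < −a that the factor |x| accounts for.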


In Chapter 2 we saw that curves may be described parametrically thus:

$$x = X(t), \qquad y = Y(t), \tag{5.44}$$

where t is a parameter defined in some interval ℐ. The question that now
arises is how may we find dy/dx in terms of the functions X(t) and Y(t).
Let us suppose that X(t) and Y(t) are differentiable functions of t with
continuous derivatives and that $X'(t) \neq 0$. Then by Theorem 5.23, we may
solve x = X(t) in the form t = f(x), say, so that then y = Y[f(x)]. From
Theorem 5.7 on the differentiation of composite functions we have

$$\frac{dy}{dx} = \frac{d}{dx}\,Y[f(x)] = \frac{dY}{df}\frac{df}{dx}$$

or,

$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx}. \tag{5.45}$$

However, by Corollary 5.23, dt/dx = 1/(dx/dt) so that

$$\frac{dy}{dx} = \frac{dy}{dt}\bigg/\frac{dx}{dt}. \tag{5.46}$$

Hence, like x and y, the derivative dy/dx is now also known parametrically
in terms of t.
This result is best remembered in symbolic operator form:

$$\frac{d}{dx} \equiv \frac{1}{(dx/dt)}\frac{d}{dt}. \tag{5.47}$$

Higher order derivatives with respect to x may be found either by a
repetition of the argument leading to Eqn (5.46), or by successive applications
of Eqn (5.47).

Thus, using Eqn (5.47), we have

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{1}{(dx/dt)}\left[\frac{d}{dt}\left(\frac{dy}{dt}\bigg/\frac{dx}{dt}\right)\right]$$

or, denoting differentiation with respect to t by a dot,

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{1}{\dot{x}}\frac{d}{dt}\left(\frac{dy}{dx}\right).$$

Using the fact that $dy/dx = \dot{y}/\dot{x}$ and performing the indicated differentiations
gives

$$\frac{d^2y}{dx^2} = \frac{\dot{x}\ddot{y} - \dot{y}\ddot{x}}{\dot{x}^3}. \tag{5.48}$$

It is recommended that the reader remembers the arguments leading to the
operator rule (5.47) together with the rule itself, rather than remembering
results of the form (5.48).


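Example 5.26 If $x = t + 2\sin t$, $y = \cos t$ determine dy/dx and $d^2y/dx^2$
and hence deduce their values when t = 0.

Solution We have

$$\frac{dx}{dt} = 1 + 2\cos t, \qquad \frac{dy}{dt} = -\sin t,$$

so that by Eqn (5.46)

$$\frac{dy}{dx} = \frac{dy}{dt}\bigg/\frac{dx}{dt} = \frac{-\sin t}{1 + 2\cos t}.$$

When t = 0 we have x = 0, y = 1 and

$$\frac{dy}{dx}\bigg|_{t=0} = \frac{-\sin t}{1 + 2\cos t}\bigg|_{t=0} = 0.$$

Next, as

$$\frac{d^2y}{dx^2} = \frac{1}{(dx/dt)}\frac{d}{dt}\left(\frac{dy}{dx}\right)$$

we have

$$\frac{d^2y}{dx^2} = \frac{1}{1 + 2\cos t}\,\frac{d}{dt}\left[\frac{-\sin t}{1 + 2\cos t}\right].$$

Thus, performing the differentiation and simplifying,

$$\frac{d^2y}{dx^2} = -\left[\frac{2 + \cos t}{(1 + 2\cos t)^3}\right]$$

and so

$$\frac{d^2y}{dx^2}\bigg|_{t=0} = -\left[\frac{2 + \cos t}{(1 + 2\cos t)^3}\right]_{t=0} = -\frac{1}{9}.$$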

5.11 Higher order partial derivatives 


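If the function f(x, y) is differentiable with continuous first-order derivatives
$f_x$ and $f_y$, then it can also happen that these partial derivatives which are
functions of x and y are themselves differentiable. Thus we are led to consider
the further partial derivatives

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \quad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right), \quad \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right), \quad\text{and}\quad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right).$$

These functions, when they exist, are second-order partial derivatives of f
and are respectively denoted by

$$\frac{\partial^2 f}{\partial x^2}, \quad \frac{\partial^2 f}{\partial y\,\partial x}, \quad \frac{\partial^2 f}{\partial x\,\partial y}, \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2}.$$

Using an alternative notation we often write these same derivatives as

$$f_{xx}, \quad f_{xy}, \quad f_{yx}, \quad\text{and}\quad f_{yy}.$$

In this notation the first suffix signifies the partial derivative of f that is to be
differentiated partially with respect to the second suffix. The centre pair of
derivatives are mixed second-order partial derivatives and it is conventional
that the order of x and y in corresponding mixed derivatives in the two
notations is interchanged. Thus we have,

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}(f_x) = f_{xy} \quad\text{and}\quad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}(f_y) = f_{yx}.$$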

It is important to notice that the double operations of partial differentiation
that lead to the mixed derivatives $f_{xy}$ and $f_{yx}$ are performed in different
orders. Consequently we have no right to expect that the derivatives that
result will be equal to one another. To emphasize this point we now write out
in full the limiting operations involved in arriving at $f_{xy}$ and $f_{yx}$:

$$f_{xy}(x_0, y_0) = \frac{\partial}{\partial y}\left[f_x(x, y)\right]_{(x_0, y_0)} = \lim_{k \to 0}\frac{1}{k}\left[\lim_{h \to 0}\frac{f(x_0 + h, y_0 + k) - f(x_0, y_0 + k)}{h} - \lim_{h \to 0}\frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}\right]$$

and so, writing

$$g(x_0, y_0, h, k) = f(x_0 + h, y_0 + k) - f(x_0, y_0 + k) - f(x_0 + h, y_0) + f(x_0, y_0),$$

we obtain the result

$$f_{xy}(x_0, y_0) = \lim_{k \to 0}\lim_{h \to 0}\frac{1}{hk}\,g(x_0, y_0, h, k), \tag{5.49}$$

where the inner limit with respect to h is to be taken first. Exactly similar
reasoning gives the corresponding result

$$f_{yx}(x_0, y_0) = \lim_{h \to 0}\lim_{k \to 0}\frac{1}{hk}\,g(x_0, y_0, h, k). \tag{5.50}$$

Here it is the inner limit with respect to k that is to be taken first.
The double limits used in Eqns (5.49) and (5.50) are called iterated limits
on account of the fact that they are taken sequentially so that their order is
important. They are not to be confused with the simple double limit of
Definition 3.8 into which questions of order do not enter.

Let us now explore the consequence of requiring one of the mixed
derivatives, say $f_{xy}$, to be continuous. This is, of course, the usual situation.
Definitions 3.8 and 3.9 imply that if $f_{xy}$ is continuous at $(x_0, y_0)$, then a limit
$L = f_{xy}(x_0, y_0)$ exists with the property that

$$L = \lim_{\substack{h \to 0 \\ k \to 0}} f_{xy}(x_0 + h, y_0 + k), \tag{5.51}$$

where the question of the order in which the limits are to be taken does not
occur. Hence, as $f_{xy}(x_0, y_0)$ is also defined by Eqn (5.49) in which an iterated
limit is involved, the equating of these two results implies that if $f_{xy}$ is continuous,
then the order of the iterated limits in Eqn (5.49) is immaterial. Thus,
under the stated conditions, expressions (5.49) and (5.50) become identical
and the continuity of $f_{xy}$ implies not only the existence of $f_{yx}$, but also that
$f_{xy} = f_{yx}$. This establishes our next result.


THEOREM 5.24 (equality of mixed derivatives) Let f(x, y) be a real valued
function of the real variables x, y, and let $f_x$, $f_y$, $f_{xy}$ exist and be continuous
in the neighbourhood of the point $(x_0, y_0)$. Then $f_{yx}$ also exists at $(x_0, y_0)$ and

$$\frac{\partial^2 f}{\partial x\,\partial y}\bigg|_{(x_0, y_0)} = \frac{\partial^2 f}{\partial y\,\partial x}\bigg|_{(x_0, y_0)}.$$


Still higher-order derivatives can be defined by an obvious extension of
the notation. Thus, for a suitably differentiable function f we may define the
third-order partial derivatives

$$f_{xxx}, \quad f_{xxy}, \quad f_{xyx}, \quad f_{yyy}, \quad\text{etc.}$$

If the higher-order derivatives involved are continuous then, by an
obvious extension of Theorem 5.24, the order of performing differentiations
may be disregarded. In the case of the mixed third-order partial derivative
$f_{xyx}$ this would imply that

$$f_{xyx} = \frac{\partial}{\partial x}\left(f_{xy}\right) = \frac{\partial}{\partial x}\left(f_{yx}\right) = f_{yxx}.$$

Hence, under these conditions, it is proper to extend the ∂ notation by
writing

$$\frac{\partial^3 f}{\partial x^3}, \quad \frac{\partial^3 f}{\partial x\,\partial y^2}, \quad \frac{\partial^3 f}{\partial x^2\,\partial y}, \quad \frac{\partial^3 f}{\partial y^3}, \quad\text{etc.}$$


Example 5.27 If $f(x, y) = x^4 + 2x^2y^2 + xy^4$ find the second- and third-order
partial derivatives of $f$.

Solution First-order derivatives:

$$f_x = 4x^3 + 4xy^2 + y^4, \qquad f_y = 4x^2y + 4xy^3.$$

Second-order derivatives:

$$f_{xx} = 12x^2 + 4y^2, \qquad f_{yy} = 4x^2 + 12xy^2,$$

$$f_{xy} = \frac{\partial}{\partial y}(f_x) = 8xy + 4y^3.$$

This mixed derivative is continuous, and so $f_{xy} = f_{yx}$. As a check in this case
we compute $f_{yx}$ directly:

$$f_{yx} = \frac{\partial}{\partial x}(f_y) = 8xy + 4y^3.$$

Third-order derivatives:

$$f_{xxx} = 24x, \qquad f_{yyy} = 24xy, \qquad f_{xyy} = \frac{\partial}{\partial y}(f_{xy}) = 8x + 12y^2,$$

$$f_{xxy} = \frac{\partial}{\partial y}(f_{xx}) = 8y.$$

The continuity of the third-order derivatives we have computed ensures the
existence and equality of the other corresponding third-order derivatives that
may be defined. Thus, for example, as $f_{xxy} = 8y$ is continuous, there is no
need to compute $f_{xyx}$, since it exists and is equal to $f_{xxy}$.
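The result of Example 5.27 can also be checked numerically. The following Python sketch is an illustration only and is not part of the original argument: it approximates the two mixed derivatives by nested central differences and compares them with $8xy + 4y^3$. The helper names, the step length `h` and the sample point are arbitrary choices made here.

```python
def f(x, y):
    # The function of Example 5.27.
    return x**4 + 2 * x**2 * y**2 + x * y**4

def d_dx(g, h=1e-4):
    # Central-difference approximation to the partial derivative of g
    # with respect to x (h is an assumed step length).
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    # Central-difference approximation to the partial derivative of g
    # with respect to y.
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))  # differentiate with respect to x first, then y
f_yx = d_dx(d_dy(f))  # differentiate with respect to y first, then x

def exact_mixed(x, y):
    # The closed form found in the example.
    return 8 * x * y + 4 * y**3

x0, y0 = 1.5, -0.7  # an arbitrary sample point
print(f_xy(x0, y0), f_yx(x0, y0), exact_mixed(x0, y0))
```

The three printed numbers should agree to within the accuracy of the difference approximations.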


Example 5.28 Define the function $f$ by the requirement

$$f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if either } x \ne 0 \text{ or } y \ne 0, \\[2mm] 0 & \text{if both } x = 0 \text{ and } y = 0. \end{cases}$$

Deduce the value of each of the mixed derivatives at the origin.

Solution We shall use definitions (5.49) and (5.50) for this purpose by setting
$x_0 = 0$, $y_0 = 0$ so that

$$f(h, k) = \frac{hk(h^2 - k^2)}{h^2 + k^2}.$$

Then, from Eqn (5.49),

$$f_{xy}(0, 0) = \lim_{k \to 0}\lim_{h \to 0}\frac{1}{hk}\left\{\frac{hk(h^2 - k^2)}{h^2 + k^2}\right\} = \lim_{k \to 0}\lim_{h \to 0}\frac{h^2 - k^2}{h^2 + k^2} = \lim_{k \to 0}\left(\frac{-k^2}{k^2}\right) = -1.$$

However, because the order of the iterated limits is reversed in Eqn (5.50),
the same argument also shows that

$$f_{yx}(0, 0) = \lim_{h \to 0}\lim_{k \to 0}\frac{h^2 - k^2}{h^2 + k^2} = \lim_{h \to 0}\left(\frac{h^2}{h^2}\right) = 1.$$

Thus $f_{xy}(0, 0) = -1$ whereas $f_{yx}(0, 0) = 1$. This occurs because the functions
$f_{xy}$ and $f_{yx}$ are not continuous at (0, 0) as may be checked by direct calculation.
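The unequal mixed derivatives of Example 5.28 can be seen numerically as well. The Python sketch below is a rough illustration under stated assumptions: the first derivatives are approximated by central differences with a small step `h`, and each is then differenced across the origin with a larger step `k`. The function names and step sizes are choices made here, not part of the text.

```python
def f(x, y):
    # The function of Example 5.28, with the value 0 assigned at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def f_x(x, y, h=1e-6):
    # Central-difference approximation to the derivative with respect to x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-6):
    # Central-difference approximation to the derivative with respect to y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def fxy_at_origin(k=1e-3):
    # Differentiate with respect to x first, then difference in y at (0, 0).
    return (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)

def fyx_at_origin(k=1e-3):
    # Differentiate with respect to y first, then difference in x at (0, 0).
    return (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)

print(fxy_at_origin(), fyx_at_origin())
```

The inner step `h` is taken much smaller than the outer step `k` so that the inner limit is effectively completed first, which mirrors the iterated limits used in the solution.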


PROBLEMS 
Section 5-1 


5-1 Give examples of four physical quantities that are essentially defined in 
terms of a derivative. 


5-2 Use Definitions 5-1 and 5-2 to prove that the following functions are differ- 
entiable in the stated intervals and to compute their derivatives. Evaluate 
these derivatives for the stated values: 

(a) $f(x) = 3x^2$ in $[0, 3)$, find $f'(2)$;

(Ὁ) f(x) = 2x3 + x + 1 in [—1, 4], find (Ὁ); 
(c) $f(x) = |x|$ in $(0, \infty)$, find $f'(1)$;

(d) $f(x) = |x|$ in $(-\infty, 0)$, find $f'(-3)$;

(e) f(x) = 1/x in [1, 5], find f°); 

(Ὁ f(x) = χά in (0, 0), find (2). 


5:3 Prove that f(x) = | x | is not differentiable at the origin. 


5-4 Consider the graph of f(x) = x? + x + 1. Let x1 and x2 be two points on 
the x-axis with the property that the gradient dy/dx of the curve y = f(x) at 
x = x2 is four times the gradient at x = x1. Derive the algebraic equation 
connecting x1 and x2 and deduce that | x2 | > 1.


5:5 Deduce the gradients of the functions f(x) to the immediate left and right of 
x = 1 given that: | 
x8+x+1forx>1 
a = 
(a) fx) oi x — x*forx <1; 
x8—x-+3forx>1 
2x + 1 for x < 1. 


5.6 Prove that the function $f$ defined by $f(x) = x^2 \sin(1/x)$ for $x \ne 0$ and
$f(0) = 0$ is differentiable at the origin and find the value of its derivative
there.


(b) fi) = | 


5.7 Prove from first principles that $\dfrac{d}{dx}(\cos ax) = -a \sin ax$.


5.8 At which points in the stated intervals, if any, are the following functions 
f(x) non-differentiable: 


(a) f(x) = x + sin2x ἴογ 0 <x <7; 
x + 1/x for x ≠ 0
b = in the i 1 [— : 
(b) f(x) 5 eee 2 in the interval [—1, 1); 
(c) $f(x) = \begin{cases} 1 & \text{for } x \text{ rational} \\ 0 & \text{for } x \text{ irrational} \end{cases}$ in the interval $[0, 1]$.


5:9 The function f(x) is defined on the interval 0 < x < 1 by the expression 
sin2x forO<x< ἐπ 
{= 


ax + bforin<x<l. 


Deduce the values of a and b in order that the function should be continuous
and have a continuous derivative at x = ἐπ. Interpret these conditions 
geometrically. 


510. Give an example of a continuous function f defined on the interval [0, 5], 
that is differentiable everywhere except at x = 1 at which point the left-hand 
derivative is 3 and the right-hand derivative is 5. That is to say, the tangent 
line to the graph drawn to the left of x = 1 has gradient 3 whilst the tangent 
line to the graph drawn to the right of x = 1 has gradient 5. 


Section 5-2 


511 By assuming Theorem 5-2 is also valid for rational where necessary, find 
the derivatives of the following functions, stating at which points in their 
domains of definition, if any, they are non-differentiable: 


x/3 +. cos 3x, for x 40 
ΟΕ ᾿ for χξξο 
(Ὁ) f(x) = x sin 2x + x5/8 for -1 <x <3; 
(c) $f(x) = |\cos x|$ for $0 < x < \pi$.


5.12 Use Theorem 5.4 to give an inductive proof that, if $k_1, k_2, \ldots, k_n$ are
constants and $f_1(x), f_2(x), \ldots, f_n(x)$ are differentiable functions in the interval
$a < x < b$, then

$$\frac{d}{dx}\sum_{i=1}^{n} k_i f_i(x) = \sum_{i=1}^{n} k_i f_i'(x) \quad \text{in } a < x < b.$$


in the interval --ἀπ <x < π; 


5:13 Differentiate the following functions: 
(a) y= x¥3 sin x; 
(b) y = (x? + 3x + 1)(1 + cos 2x); 
(c) y = sin 6x cos 2x; 
(d) y = (x8 + 2x — 1) cos 3x. 

514 Differentiate the following functions by making a repeated application of 
Theorem 5.5: 
(a) y = (1 + x?) sin 7x cos 4x; 
(b) y= ( + 2x2 + x48; 
(c) y = cos? 2x; 
(ὦ y = (1 + x3)? sin? 3x. 

515 Differentiate these composite functions: 
(a) y = (x? + 2x + 1); 
(b) y = (a + bx9)/8;

(c) y= (2 + 3 sin 2χ)5: 
(4) y = sin (1 + 2x); 

(0) y = sin [sin (1 + x*)]; 
(Ὁ y = cos (1 + x4)¥/?. 


5:16 Differentiate these quotients: 
(a) y = (x? + 3x + 7/(x* + 1); 


_ sin(l + x?) | 
(Ὁ) Y= a Oe + 6° 
1 1 
oe as 3cos?x  cosx’ 
_ tan (1 — x2 + x4) 
Dy in@th 
i τς δ 
(9) »- 1— Vx 
5:17 Differentiate these functions: 
1 
@ Y= GF eos 
- x ᾿ 
OY χει x 
oye tan (1 + x? + x4). 


sin(1 + x?) ᾿ 
(d) y = cosec? (1 + 3x); 
__ sin x + 2 cos x, 

sin x — 2cosx’ 


3x — 1) 
(f) y = cot Fest 


5-18 If the functions τι), f2(x), g1(x), and g2(x) are differentiable, show by direct 
expansion that this theorem is true: 


d | fate) fel) fi) 60) 
dx | gi(x) ga(x) σι) με) 


(e) y 


ΕἾ βοὴ felx) 
σι) σε) 


Apply this result to differentiate the determinants: 
(a) (b) | (1 + x2 cos x) (2 — sin? x) 
(1 — x2cos x) (2 + sin? x) 


x2 xx sinx |. 
3 ° 


COs Xx 1 


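5.19 Suppose that the functions $f_{ij}(x)$, with $i, j = 1, 2$, or 3, are differentiable
functions of $x$. Prove, by means of Problem 5.18, that

$$\frac{d}{dx}\begin{vmatrix} f_{11}(x) & f_{12}(x) & f_{13}(x) \\ f_{21}(x) & f_{22}(x) & f_{23}(x) \\ f_{31}(x) & f_{32}(x) & f_{33}(x) \end{vmatrix} = \begin{vmatrix} f_{11}'(x) & f_{12}'(x) & f_{13}'(x) \\ f_{21}(x) & f_{22}(x) & f_{23}(x) \\ f_{31}(x) & f_{32}(x) & f_{33}(x) \end{vmatrix} + \begin{vmatrix} f_{11}(x) & f_{12}(x) & f_{13}(x) \\ f_{21}'(x) & f_{22}'(x) & f_{23}'(x) \\ f_{31}(x) & f_{32}(x) & f_{33}(x) \end{vmatrix} + \begin{vmatrix} f_{11}(x) & f_{12}(x) & f_{13}(x) \\ f_{21}(x) & f_{22}(x) & f_{23}(x) \\ f_{31}'(x) & f_{32}'(x) & f_{33}'(x) \end{vmatrix}.$$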
Section 5-3 


5.20 Use the intermediate value theorem to prove that if $f(x)$ is continuous on
$[a, b]$, with $f(a)$ and $f(b)$ having opposite signs, then there must be at least
one point $x = \xi$, with $a < \xi < b$, for which $f(\xi) = 0$.


5.21 Why is it not possible to conclude from the intermediate value theorem that
if $f(x) = 1/(1 - |x|)$ for $|x| \ne 1$ and $f(x) = 0.5$ for $|x| = 1$, then
(a) there is no point $x = \xi$ in the interval $[0, 6]$ for which $f(\xi) = 0$;
(b) yet there is a point $x = \eta$ in the interval $[-11, -2]$ for which $f(\eta) = -0.5$?
Identify the point on the x-axis giving rise to this functional value.


5.22 The function $f(x) = \tfrac{1}{3}x^3 - x + 2$ which is defined in the interval $(-\infty, \infty)$
has extrema at the points $x = 1$, $x = -1$. Identify their nature by considering
the behaviour of the function close to these points. Are they relative or
absolute extrema?


5:23 By considering the behaviour of f(x) = sin $x cos 3x in the neighbourhood 
of x = ἐπ, show that the function attains an absolute maximum at that point. 


5.24 By considering the behaviour of $y = x^2 - 2x + 3$ in the neighbourhood of
x = 1, prove that this point gives rise to an absolute minimum of the 
function. Find its value. 


5:25 Find the critical points of the function f(x) = x3 — x? — 4x + 4. Identify 
the nature of the extrema associated with them by considering the functional 
behaviour close to each of these points. 


5.26 Find the critical point of the function $f(x) = (x - 1)x^{2/3}$ and identify its
nature. Do the points x = —1, x = 0 correspond to extrema of the function 
and, if so, of what type are they? 


5:27 Find the critical points of the function f(x) = x2(3 — x)?. 
5.28 Identify the critical points and extrema of the function

$$f(x) = \begin{cases} x^2 - 3x + 2 & \text{for } 0 \le x \le 2.5 \\ x^2 - 7x + 12 & \text{for } 2.5 < x \le 5. \end{cases}$$
5.29 Apply Rolle's theorem to the following functions where it is applicable, and
hence determine at how many points in the stated intervals $[a, b]$ the following
functions satisfy the result of that theorem:
(a) $f(x) = x^2 - 1$ in $[-2, 2]$;
(b) $f(x) = 1 + \sin x$ in $[-2\pi, 3\pi]$;
(c) $f(x) = 1/(1 + |x|)$ in $[-1, 1]$;
(d) $f(x) = \begin{cases} x^2 + 3x + 2 & \text{for } -1 \le x \le 0 \\ x^2 - 3x + 2 & \text{for } 0 < x \le 1. \end{cases}$


5.30 Give an example of a simple continuous function $g(x)$ of the type illustrated
in Fig. 5.9 (b) in which $g'(\xi) = 0$ for some point in an interval $[a, b]$, but to
which Rolle's theorem is inapplicable because $g(x)$ is non-differentiable at
one point of that interval.


5°31 


5:32 
Show that the conditions of the mean value theorem apply to f(x) = x +
sin x for the interval [0, 47]. Find the value of ξ in the statement of the 
theorem. 


In the proof of Theorem 5.12 a function $F(x)$ was constructed on the interval
$[a, b]$ which had the property that $F(a) = F(b) = 0$ and, in addition,
satisfied the other conditions of Rolle's theorem. Repeat the proof of Theorem
5.12, but this time with the requirement that $F(a) = F(b) = k$, where $k$ is
an arbitrary non-zero constant.


The following four problems illustrate how the mean value theorem may be 
used to estimate the behaviour of functions in closed intervals. 


5°33 


5°34 


5°35 


Let $f(x)$ be a differentiable function having a monotonic increasing derivative
in the interval $[a, b]$. Then by writing the mean value theorem in the form
$f(b) = f(a) + (b - a)f'(\xi)$, with $a < \xi < b$, prove that
$f(a) + (x - a)f'(a) < f(x) < f(a) + (x - a)f'(b)$, for $a < x < b$. We shall agree to say that
these inequalities define upper and lower estimates of $f(x)$ in $[a, b]$. Show
also that if $f'(x)$ is monotonic decreasing, then the inequalities must be
reversed in the above expression.


Apply the result of Problem 5.33 to the function $f(x) = \sin x$ in the interval
$[0, \tfrac{1}{2}\pi]$ in order to prove that $0 < (\sin x)/x < 1$ for $0 < x < \tfrac{1}{2}\pi$.


Apply the result of Problem 5-33 to the function f(x) = (1 + x?)°/? in the 
interval [1,2], thereby obtaining upper and lower estimates for it in that 
interval. 


5.36 If f(x) = 1+ x + (1/5) sin? x, show that f(x) is monotonic increasing in 


5:37 


5:38 


the interval [--- ἔπ, 47]. Hence apply the result of Problem 5.33 to f(x) to 
obtain upper and lower estimates for f(x) in that interval. Evaluate the 
inequalities for x = Ὁ and x = ἐπ and compare the estimates with the exact 
result. 


Let the functions $f(x)$ and $g(x)$ be continuous in $[a, b]$ and differentiable in
$(a, b)$, with $g'(x)$ non-zero in $(a, b)$. Show that under these conditions Rolle's
theorem may be applied to the function $F(x)$ defined by
$F(x) = f(a)g(b) - f(b)g(a) + [g(a) - g(b)]f(x) - [f(a) - f(b)]g(x)$, for $a \le x \le b$.
Hence establish the Cauchy extended mean value theorem.


By repeatedly applying L’Hospital’s rule where necessary, evaluate the 
following indeterminate forms of the type 0/0: 


tan ox x cos x — sin x. 


a) lim ——; b) lim 2 
(a) a (b) ἐπὶ xa 
tan x — sin x . x8— 2x? —x +2 
im ------------; d) lim ------  -----..-- 
ω Ων x— sinx oo x? — 7x + 6 
. x? — sin? x 
Θ zie! x2 sin? x 
5-39 Evaluate the following indeterminate forms which are of the type 2/00: 
(a) lim (7/x)/cot x/2; (b) im tan x/tan 5x; 
x-—>0 2h 
3x2? ++ x—1 cot x 
. sxe + x — 1, lim —C*_. 
©) bees x24+2 ” (d) eee x —cotx 
5.40 Explain the fallacy in this argument. The limit

$$\lim_{x \to \infty}\frac{x^2 + x\sin x + \sin x}{x^2}$$

does not exist because, applying Corollary 5.14 to L'Hospital's rule gives

$$\lim_{x \to \infty}\frac{x^2 + x\sin x + \sin x}{x^2} = \lim_{x \to \infty}\frac{2x + \sin x + x\cos x + \cos x}{2x} = \lim_{x \to \infty}\left[1 + \tfrac{1}{2}\cos x + \frac{\sin x + \cos x}{2x}\right] = 1 + \tfrac{1}{2}\lim_{x \to \infty}\cos x.$$

What is the true value of this limit?


5.41 Indeterminate limits of the form $\infty - \infty$, $0 \cdot \infty$ can be reduced to the types
0/0 or $\infty/\infty$ by means of the following simple devices. If the limit is of the
type $0 \cdot \infty$ set $\lim f(x) = 0$ and $\lim g(x) \to \infty$, then

$$\lim\,[f(x)g(x)] = \lim\left[\frac{f(x)}{1/g(x)}\right] \ (\text{type } 0/0) = \lim\left[\frac{g(x)}{1/f(x)}\right] \ (\text{type } \infty/\infty).$$

If the limit is of the type $\infty - \infty$ set $\lim f(x) = 0$, $\lim g(x) = 0$, then

$$\lim\left[\frac{1}{f(x)} - \frac{1}{g(x)}\right] = \lim\left[\frac{g(x) - f(x)}{f(x)g(x)}\right] \ (\text{type } 0/0).$$


Apply these results to evaluate the following limits: 


1 Ly. 1 5 
(@) tim (S55) ὦ lim (3 τ ws): 
: ᾿ ὦ 9 | 
(c) lim (1 — cos x) cot x; (4) lim x sin -; | 
x—0 το χ 
πὰ x 7 
i -- =; f) lim {—— -- 
(e) a Sc si Ζ oe cae [ὠ χ  2cos . 
5.42 Verify the nature of the extrema in Problem 5-20 using the results of Theorem 
5.15. 
5.43 Verify the nature of the extrema in Problem 5-23 using the results of Theorem 
5:15. 


5:44 Apply to Problem 5:26 the modification to Theorem 5-15 indicated at the : 
end of Example 5-10 (b) to identify the behaviour of the function at the origin. 


5.45 Apply Theorem 5-15 to Problem 5-26 to identify the extrema occurring in 
the interval (0, 5]. 


5.46 If y = f(x), where fis a differentiable function, find the differential dy given 
that: 
(a) f(x) = x8 + 3x2 + x + 6; 
(b) f(x) = x sin (x? + 1); 
T+ x:


(d) f(x) =  ( + x2)¥2. 


5.47 Metals A and B have coefficients of linear expansion $\alpha$ and $\beta$, respectively.
That is to say, when the temperature changes by an amount $t$ from the
ambient value $T_0$, the linear dimensions of metal A change by a factor
$(1 + \alpha t)$, whilst those of metal B change by a factor $(1 + \beta t)$. Suppose that a
block of metal A contains a cylindrical cavity of height $H_0$ and radius $R_0$ at
temperature $T_0$ which is empty apart from a cylinder of metal B which has
height $h_0$ and radius $r_0$ at that same temperature. Obtain an approximate
expression for the small volume change $dV$ of the cavity between the cylinders
consequent upon a small change of temperature $dt$.


Section 5-4 
5.48 Compute the first and second derivatives of the functions f(x) listed below:


3°49 


5:50 


5.51 


5:52 


5:53 


(a) f(x) = tan x; 

(0) f(x) = x? sin x; 

(c) f(x) = (1 + x) sin x + cos 2x); 
(ὦ f(x) = @? + 1); 

(e) f(%) = sin (1 + x?*); 


1 
(Ὁ fla) = tan -- 


Show that if $f(x) = \tfrac{1}{2}(3x^2 - 1)$, then
$(1 - x^2)f''(x) - 2xf'(x) + 6f(x) = 0$.


Equations of this type are called second order ordinary differential equations, 
and this one is a special case of Legendre’s differential equation. 


If f(x) = ἐ(.χ — 3x) and g(x) = 3(3x? — 1), find the algebraic equation 
connecting f’(x), σ΄), and f(x). 


Show that the function f(x) defined below is continuous and has a con- 
tinuous first derivative at x = 1, but that it has a discontinuous second de- 
rivative at that point: 


x4 x2 -— x + 1lforx <1 
f@) = 
2x3 — x2? +x #8 forx>1. 
Use Leibnitz’s theorem to evaluate the third derivatives of the following 
functions: 
x? 
@) f@) = 75 


(c) f(x) = sin? x; (d) f(x) = x8 sec 2x. 


Apply Theorems 5-17 and 5-18 to locate and identify the extrema and points 
of inflection of the following functions, using your results to determine the 
gradients at the points of inflection: 


(a) f(x) = 2x8 + 3x? — 12x 4+ 5; 
x3 

(0) f(x) = πὲ 3' 

(c) f(x) = χξὰ -- 12)?. 


(b) f(x) = (x? — 1) tan x; 




5-54 Use the mean value theorem to prove that if f(x) has a maximum at x = xo, 
then near to xo, f’(x) < 0. Show that if f(x) has a minimum at x = xo, then 
near to xo, f° Ὁ) > 0. Hence show that these tests may be used to identify 
maxima and minima, even when f{’(xo) does not exist. 


5:55 Apply the results of Problem 5:54 to prove that the function f(x) = 
(3x — 1)x2/3 has a maximum at the origin. 


5.56 Determine the values of $a$ and $b$ in order that $f(x) = x^3 + ax^2 + bx + 1$
should have a point of inflection at x = 2 at which the gradient of the 
tangent to the graph is —3. 


Section 5°5 
5.57 Compute the derivatives $f_x$ and $f_y$ given that:
(a) f(x, y) = x?/y; 
(b) f(x, y) = 3x2y + & + γ)ῦχ + 1; 
(Ὁ f(x, y) = sin (x? + y?); 
(4) f(x, y) = x cos (1--Ὲ x? y”). 


5.58 Given that
$f(x, y) = x^3 + 3x^2y + 4xy^2 + 2y^3$
prove that $xf_x + yf_y = 3f$.
5:59 Compute the derivatives Ἂ fu fe given that: 


(a) f(x, y, z) = x®yz + —;5 


τσ’ 
(Ὁ) f(x, γ, 2) = x cos yz + y COS xz + Ζ 008 xy; 
(c) f(x, y, 2) = cos (x? + xy + yz). 

5-60 Show that if 

x 


LOY 2) Ξ Cap yay ae 
then xfz + yfy + 2f2 = —2f. 
5-61 Show that if 
IXY, = x + : 
thn ἐς εἰν Ἐ ΚΜ = 1. 
5.62 Show that if
$f = (x - y)(y - z)(z - x)$
then $f_x + f_y + f_z = 0$.


Section 5-6 
5:63 Find the total differential du given that u = f(x, y, 2), where: 


wee: 


“.-.Ζ2 


(a) f(x, y, z) = πες Ὑ 592; 


(Ὁ) f(x, y, 2) = χ sin je + z?); 
(c) f(x, y, z) = UL - x? — y? — 25. 
5.64 The speed of wave propagation $u$ in a transmission line with inductance $L$
and capacitance $C$ is given by the equation $u = (LC)^{-1/2}$. Relate the
differential $du$ to the differentials $dL$ and $dC$. How must $dL$ and $dC$ be related if
$u$ is to remain constant?


5.65 Apply the triangle inequality $|a + b| \le |a| + |b|$ to establish that, if
$u = f(x_1, x_2, \ldots, x_n)$ is differentiable with respect to each of its independent
variables $x_1, x_2, \ldots, x_n$, then

$$|du| \le \left|\frac{\partial f}{\partial x_1}\right||dx_1| + \left|\frac{\partial f}{\partial x_2}\right||dx_2| + \cdots + \left|\frac{\partial f}{\partial x_n}\right||dx_n|.$$

A triangle with sides of length $a$, $b$, $c$ has area $A = \sqrt{[s(s - a)(s - b)(s - c)]}$,
where $2s = a + b + c$ is the perimeter. If $s$ is kept constant, find the largest
possible value that may be assumed by $|dA|$, the absolute value of the area
differential $dA$, consequent upon changes in the differentials $da$, $db$, and $dc$.
Apply the result to an equilateral triangle in which $a = b = c = 4$, when
changes $da = 0.01$, $db = 0.015$, and $dc = -0.025$ are made.


5°66 Compute dy/dx from the following implicit relationships: 
(a) x? ++ y? = 4; 
(b) x sin xy = 1; 
(c) x2y + 2xy? + γ8 = 2. 
5.67 Compute $\partial z/\partial x$ and $\partial z/\partial y$ given that:
(a) x8 + y? + 22 = 1; 
(Ὁ) xyz + sin xz? = 2; 
(c) x2 — 2y? + 323 — pz + y =0; 
(ὦ x cosy + ycosz+ zcosx = 1. 


Section 5-7 
5.68 Find the envelope of the family of curves with parameter $\alpha$:
$(x - \alpha)^2 + y^2 = \alpha^2/2$.
5:69 Find the envelope of the family of curves with parameter « 


3 
λα εν τοι 


5:70 When a particle is projected into the air with velocity V at an angle θ to the 


horizontal then, neglecting air resistance, its height y when distant x from the 
point of projection is given by 

τοῦδε τος 

2V2 cos? 6 

By regarding @ as a parameter, show that the envelope of the family of 


trajectories for 0 < @ < isa parabola, and find its equation. This is usually 
called the parabola of safety because no projectile can penetrate beyond it. 


y=xtan 6— 


5.71 Find the envelope of the family of curves with parameter $\alpha$ specified by
$(x - \alpha)^2 + (y + \alpha)^2 - \alpha^2 = 0$.
5.72 Show that the envelope of the family of curves with parameter $\alpha$ defined by
$x\cos\alpha + y\sin\alpha = 2$
is a circle. Find its centre and radius. Interpret this family geometrically.
Section 5-8
5:73 Find du/dt given that: 
(a) wu = xy + sin (x? + y?) with x = 27, y = (1 + £?)1/?; 
(b) wu = (1 + x? + y*)9/? with x = ΚΙ τ ἢ), y = 13; 


()u= with x = 3cost, y = 3sin#, z= ¢? 


Ζ 
(x2 + yy 
5.74 If u = x? — xy + y®, compute du/d¢ at points on the curve specified para- 
metrically by the equations x = 27 -Ἔ 1, τῷ 15-Ὁ 1 — 2. 


[Hint: Set ¢ = 2x2 + y?.] 
5.76 If u = f(x, y), compute du/dx given that: 
(a) f(x, y) = (1 + xy + x?) where y = tan (2): 
(Ὁ) f(x, y) = (1 + x? — y?)3/? where y = cos 3x; 
(c) f(x, y) = x cos y + ycos x — 1 where y = 1 + sin? x. 


5.77 If u = f(x, y) and g(x, y) = 0 are differentiable functions, compute du/dx 
given that: 
(a) f(x, y) = x8 + 3xy + γϑ and e(x, y) = x cos y + ycos x — 2; 
(b) f(x, y) = x*y? + sin xy and g(x, y) = x2 — 2y? — 3. 


5.78 If $u = x^2 - xy + y^2$, determine $du/dx$ at points on the ellipse $2x^2 + 3y^2 = 1$.


Section 5-9 


5.79 In spherical polar coordinates a point $P$ is specified in space by giving the
ordered number triple $(r, \phi, \theta)$. Here $r$ is the radial distance of $P$ from the
origin, $\phi$ is the azimuthal angle of $P$ measured anti-clockwise from the
x-axis in the $(x, y)$-plane, and $\theta$ is the acute angle between the radius vector
drawn to $P$ from the origin and the z-axis. (See Figure.)

[Figure: spherical polar coordinates of the point $P(r, \phi, \theta)$.]

It is easily seen that:

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$

If $f(x, y, z)$ is differentiable with respect to $x$, $y$, and $z$, express $\partial f/\partial r$, $\partial f/\partial\theta$,
and $\partial f/\partial\phi$ in terms of $\partial f/\partial x$, $\partial f/\partial y$, and $\partial f/\partial z$. Find their values given that
$f(x, y, z) = x^3 + 2xy + yz + z^2$.

5.80 Given that $f(x, y, z) = x^2 + xy + \sin yz$, compute $\partial f/\partial r$, $\partial f/\partial\theta$, and $\partial f/\partial z'$,
where $(r, \theta, z')$ are the cylindrical polar coordinates corresponding to the
point $(x, y, z)$.

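5.81 The notion of a Jacobian extends to transformations involving more than two
variables. If, in Theorem 5.22, $m = n = 3$, the Jacobian or functional
determinant is

$$\frac{\partial(x_1, x_2, x_3)}{\partial(\alpha_1, \alpha_2, \alpha_3)} = \begin{vmatrix} \dfrac{\partial x_1}{\partial\alpha_1} & \dfrac{\partial x_2}{\partial\alpha_1} & \dfrac{\partial x_3}{\partial\alpha_1} \\[2mm] \dfrac{\partial x_1}{\partial\alpha_2} & \dfrac{\partial x_2}{\partial\alpha_2} & \dfrac{\partial x_3}{\partial\alpha_2} \\[2mm] \dfrac{\partial x_1}{\partial\alpha_3} & \dfrac{\partial x_2}{\partial\alpha_3} & \dfrac{\partial x_3}{\partial\alpha_3} \end{vmatrix}.$$

Evaluate the Jacobian $\partial(x, y, z)/\partial(r, \theta, z')$ for the transformation from
Cartesian to cylindrical polar coordinates.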


5.82 Use the definition in Problem 5.81 to evaluate the Jacobian
$\partial(x, y, z)/\partial(r, \phi, \theta)$ for the transformation from Cartesian to spherical polar
coordinates.


5°83 Find the Jacobians of the following transformations, stating where, if at all, 
they vanish: 
(a) x = 2u+ 30+ 1, y= 3u—- 2υ --- 1; 
(b) x =u? — v2, y = y2 + νυ; 
()x= w+ 2μυ -Ἐ υϑ, γ τε μ. 
5.84 Use Theorem 5:22 with n = 2, m = 3 to determine af/éu, δύ ὃυ, and af/aw, 
given that: 
fatty? 
where x = u2 + v + wand y = wow. 
5.85 If u and v are functions of x and y which satisfy u2 — v? + 2x + 3y = Oand 
uv +x — y = 0, find du/ox, eu/ey, dv/éx, and @v/2y in terms of u.and v. 
5.86 Prove that if $z = f(u, v)$, where $u = x + 3t$, $v = y - 2t$, then

$$\frac{\partial z}{\partial t} = 3\frac{\partial z}{\partial x} - 2\frac{\partial z}{\partial y}.$$
5.87 Show that if $u = 1/r^n$, where $r^2 = x^2 + y^2 + z^2$, then

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = \frac{n(n - 1)}{r^{n+2}}.$$
5°88 Prove that if u = 2xy + xf(y/x), then 
Ou 


“ 
Section 5:10 
5:89 Which of the following implicit functions /(x, y) = 0 may be solved explicitly 
for y in the neighbourhood of the stated points (xo, yo): 
(a) f(x, y) = x? + y? + xy — 11 at C1, 2); 
(0) f(x, y) = ( — x? — y?)l? at (—1, 0); 
(c) f(x, y) = sin xy — 1 at (1, 37); 
(ὦ f(x, y) = y + sin xy — 2 at Gz, 1)? 
5.90 Compute dx/dy for each of the following relationships: 
(a) y=1+4 x27 + xsin x; 
(0) y= (1 — x + x9)?; 
(c) y=x-+ tan x. 


Cu 
ἜΣ πως 


5.91 Differentiate these functions: 
(a) f(x) = x? arc sec (x/a); 
(b) f(x) = (x? + x + 1)/arc sin (x? — 2); 
(c) f(x) = (1 + x + are cos 2x)9/2, 
5:92 Compute dy/dx and d*y/dx? for each of the following parametrically defined 
curves: 
(8) x=t—l, y=; 
(Ὁ) x = cos? t, y = 2 sin’ f; | 


(c) x = arccos ΔΙῸ sin 


1 ΐ 

να} Va Ὁ τ᾽ 
(4) x = 2Acost + ¢tsint), y = 2(sin t — tf cos ἢ). 

5:93 Compute dy/dx and d*y/dx? at t = ἐπ ifx = ¢ — sintand y = 2(1 — cos?). 

5:94 Compute d’y/dx? when ¢t = 1, given that x = 2¢ + 1, y = ¢(1 + £2). 

5:95 In Example 5:21, an envelope is specified in terms of a parameter «, and it 
comprises two curves corresponding to the + and — signs associated with y. 


Find the gradient of each of these curves at the origin (that is, corresponding 
to « = 0). 


Section 5-11 

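5.96 Compute $\partial^2 z/\partial x^2$, $\partial^2 z/\partial x\,\partial y$, $\partial^2 z/\partial y\,\partial x$, and $\partial^2 z/\partial y^2$ for each of the following
functions and hence show that $\partial^2 z/\partial x\,\partial y = \partial^2 z/\partial y\,\partial x$: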
(a) z = (x? + y®)2; 
(b) z= xcos y + yCos x; 
(c) z = arc tan (y/x). 

5:97 Compute fzz(1, 1), fey(1, 1), and fy,(1, 1) given that 

S(x,y) = ( + x)A1 + = y)?. 
Is $\partial^2 f/\partial x\,\partial y = \partial^2 f/\partial y\,\partial x$? Give reasons for your answer.
5:98 Given that 
| aie ΑΒ for x #0, y #0 
γα) = [EEO 
1 forx=0,y=0 


compute 6//0xéy stating, with reasons, when it is equal to 0°//éyéx. Is there 
any point at which this result is not true and, if so, what property of the 


function invalidates the result? [Hint: Consider limits taken along the line 
y= mx.] 


5.99 Show that if $w = \arctan(x/y)$, then $\partial^2 w/\partial x^2 + \partial^2 w/\partial y^2 + \partial^2 w/\partial z^2 = 0$.
5.100 Given that V = arc tan Petes — y*), prove that 


OV Gas 


(a) x= τι τ = 0; (i ἜΝ oe = 0. 


5101 Compute 0%z/éxéy? and 0%z/éx?éy given that z = x*y? + sin x?y. 


Exponential, hyperbolic, 
and logarithmic functions 


6.1 The exponential function


This chapter will be concerned primarily with the exponential function, 
first introduced in connection with limits in Section 3-3 and, thereafter, with 
a number of related functions. This time our approach will be to utilize both 
geometrical ideas and the elementary calculus to produce a more useful form 
of definition than that contained in Eqn (3-6). 

Let us seek a function $E(x)$ equal to its own derivative and such that
$E(0) = 1$. Specifically, we must solve the equation

$$E'(x) = E(x) \tag{6.1}$$


which, because it involves the unknown function E(x) together with its 
derivative, is called a differential equation. This differential equation has the 
following simple geometrical interpretation: if the graph of the function E(x) 
is drawn, then the gradient of the graph at the point (x, E(x)) is equal to the 
functional value of E(x) itself. 

Perhaps it is worth remarking that Eqn (6.1), taken together with the
condition $E(0) = 1$, immediately implies that $E(x)$ is a convex function for
$x \ge 0$. No deduction can yet be made about its behaviour for $x < 0$ though,
in fact, we shall shortly prove that $E(x)$ is a convex function for all $x$.

As on previous occasions, our desired result is soonest obtained by 
studying an artificial function. The reason for considering the precise form 
of function to be used will become apparent once the result has been obtained. 

Suppose, for the moment, that there is a unique function $E(x)$ defined by
our requirements, and consider the new function $F(x)$, where

$$F(x) = E(x)E(a - x). \tag{6.2}$$

Then,

$$F'(x) = E(x)\frac{d}{dx}[E(a - x)] + E(a - x)\frac{d}{dx}[E(x)]$$

which, using the defining property (6.1), becomes

$$F'(x) = -E(x)E(a - x) + E(a - x)E(x) = 0.$$

Consequently, $F(x) = $ constant but, as $F(0) = E(0)E(a) = E(a)$, it follows
at once that $F(x) = F(0) = E(a)$ for all $x$, and thus Eqn (6.2) takes the form

$$E(x)E(a - x) = E(a).$$

Alternatively, by replacing $a$ by $a + b$ and $x$ by $b$ this may be written

$$E(a + b) = E(a)E(b). \tag{6.3}$$

Hence, if $n$ is a positive integer,

$$E(n) = E(n - 1)E(1) = E(n - 2)(E(1))^2 = \cdots = (E(1))^n. \tag{6.4}$$

If, now, we denote $E(1)$ by the symbol $e$, then Eqn (6.4) is equivalent to

$$E(n) = e^n. \tag{6.5}$$

The fact that $E(0) = 1$, taken together with Eqn (6.1), implies $E(1) > 1$, which
in turn implies, via Eqn (6.5), that $\lim_{n \to \infty} e^n \to \infty$.
Again,

$$E(-n)E(n) = E(0) = 1,$$

so that

$$E(-n) = \frac{1}{E(n)} = \frac{1}{e^n} = e^{-n}. \tag{6.6}$$

Now we must extend this notation to take account of rational and
irrational $x$. Let us consider $E(x)$ for rational $x$, so that $x = p/q$ with $p$, $q$
integers. Then, using Eqn (6.5), we may write

$$\left[E\left(\frac{p}{q}\right)\right]^q = E\left(q \cdot \frac{p}{q}\right) = E(p) = e^p,$$

and so

$$E\left(\frac{p}{q}\right) = e^{p/q}. \tag{6.7}$$

A similar argument using Eqn (6.6) shows that

$$E\left(-\frac{p}{q}\right) = e^{-p/q}. \tag{6.8}$$

Thus we have shown that for all rational $x$

$$E(x) = e^x. \tag{6.9}$$

To extend the definition of $E(x)$ to all the real numbers $x$ and not just to the
rationals, it only remains to add that for any irrational number $\xi$, we define
$E(\xi)$ by the equation $E(\xi) = e^{\xi}$.

Although the foregoing arguments have established the algebraic properties 
of E(x), they have still not provided a method of attributing an actual number 
to E(x) for any given value of x. Nor, indeed, are we certain that only one 
function E(x) exists that satisfies Eqn (6-1) and is such that E(0) = 1; that 
is to say, is E(x) unique? This question will be answered in the affirmative 
immediately following the next stage of our argument.

We now seek a series solution to our function $E(x)$ of the form

$$y = \sum_{r=0}^{\infty} a_r x^r \tag{6.10}$$

where, for simplicity, we have set $y = E(x)$ so that Eqn (6.1) now becomes

$$\frac{dy}{dx} = y, \tag{6.11}$$

with $y(0) = 1$.

Assuming that this infinite series may be differentiated termwise, we have

$$\frac{dy}{dx} = \sum_{r=1}^{\infty} r a_r x^{r-1},$$

so that substituting for $y$ and $dy/dx$ in Eqn (6.11) yields

$$\sum_{r=1}^{\infty} r a_r x^{r-1} = \sum_{r=0}^{\infty} a_r x^r$$

or, equivalently,

$$\sum_{r=0}^{\infty} (r + 1)a_{r+1} x^r = \sum_{r=0}^{\infty} a_r x^r. \tag{6.12}$$

For this result to be unconditionally true for all $x$, as it must be to satisfy
our definition of $E(x)$, it follows that it must be an identity in $x$. This can
only be possible if the coefficients of the corresponding powers of $x$ on each
side of Eqn (6.12) are identical. Hence, equating the coefficients of the
general term involving $x^r$, we find that

$$(r + 1)a_{r+1} = a_r \tag{6.13}$$

for $r = 0, 1, 2, \ldots$.

As we require that $y(0) = 1$, it follows by setting $x = 0$ in Eqn (6.10)
that $a_0 = 1$. Using this result together with Eqn (6.13), which defines the
coefficients $a_r$ recursively, it is easily seen that

$$a_0 = 1, \quad a_1 = 1, \quad a_2 = \frac{1}{2!}, \quad a_3 = \frac{1}{3!}, \quad \ldots$$

Substitution of these coefficients into Eqn (6.10) then shows that

$$E(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots \tag{6.14}$$

whatever this expression may mean.
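The recursive definition of the coefficients in Eqn (6.13) translates directly into a short computation. The Python sketch below is illustrative only: the function names are invented here, and the comparison with `math.exp` assumes that the library function agrees with the function $E(x)$ under discussion.

```python
import math

def series_coefficients(count):
    # Generate a_0, a_1, ..., a_{count-1} from a_0 = 1 and the
    # recurrence (r + 1) a_{r+1} = a_r of Eqn (6.13).
    coeffs = []
    a = 1.0
    for r in range(count):
        coeffs.append(a)
        a = a / (r + 1)
    return coeffs

def partial_sum(x, n):
    # Sum of the first n terms of the series in Eqn (6.14).
    return sum(a * x**r for r, a in enumerate(series_coefficients(n)))

print(series_coefficients(5))            # 1, 1, 1/2!, 1/3!, 1/4!
print(partial_sum(1.0, 20), math.exp(1.0))
print(partial_sum(-2.5, 40), math.exp(-2.5))
```

The last two lines suggest that the partial sums settle down rapidly, a fact that the argument which follows puts on a proper footing for positive $x$.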

We have already remarked that the sum of an infinite series is to be
interpreted as the limit of the partial sums of the series, so let us now consider
the $n$th partial sum

$$S_n = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n - 1)!} \tag{6.15}$$

of the function $E(x)$.

If $x > 0$ then $S_{n+1} - S_n = x^n/n! > 0$, so that $\{S_n\}$ is increasing. Is $\{S_n\}$
bounded? Let $R$ be an integer greater than $2x$, then $x/r < \tfrac{1}{2}$ for $r \ge R$, and so

$$\frac{x^r}{r!} = \frac{x}{1}\cdot\frac{x}{2}\cdots\frac{x}{R - 1}\cdot\frac{x}{R}\cdots\frac{x}{r} < \frac{x^{R-1}}{(R - 1)!}\left(\tfrac{1}{2}\right)^{r-R+1}.$$

Thus

$$S_n = S_R + \sum_{r=R}^{n-1}\frac{x^r}{r!} < S_R + \frac{x^{R-1}}{(R - 1)!}\sum_{r=R}^{n-1}\left(\tfrac{1}{2}\right)^{r-R+1} = S_R + \frac{x^{R-1}}{(R - 1)!}\left[1 - \left(\tfrac{1}{2}\right)^{n-R}\right] < S_R + \frac{x^{R-1}}{(R - 1)!},$$

which shows that $\{S_n\}$ is bounded. Hence by the postulate of Section 3.2 it
follows that $\lim_{n \to \infty} S_n$ exists, and we now define the sum of the infinite series
(6.14) to be equal to the value of this limit. The infinite series (6.14) is thus
defined for all positive $x$.
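The bound just obtained can be illustrated for a particular value of $x$. In the Python sketch below, which is offered only as a check and not as part of the proof, $R$ is taken to be the smallest integer greater than $2x$, and the value $x = 3$ is an arbitrary choice.

```python
import math

def S(x, n):
    # nth partial sum of Eqn (6.15): terms x^r / r! for r = 0, ..., n - 1.
    return sum(x**r / math.factorial(r) for r in range(n))

def bound(x):
    # Upper bound S_R + x^(R-1)/(R-1)!, with R the smallest integer
    # greater than 2x (one admissible choice of R).
    R = math.floor(2 * x) + 1
    return S(x, R) + x**(R - 1) / math.factorial(R - 1)

x = 3.0
print("bound:", bound(x))
for n in (1, 5, 10, 20, 40):
    print(n, S(x, n))
```

For this value of $x$ the partial sums increase with $n$ but remain below the printed bound, as the argument requires.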
As we have agreed to write $E(1) = e$, it follows from Eqn (6.14), by
setting $x = 1$, that

$$e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots, \tag{6.16}$$

which, to 15 decimal places, has the numerical value
$e = 2.718281828459045$.
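The numerical value quoted for $e$ can be recovered from the series (6.16) itself. The Python sketch below sums the first twenty terms in exact rational arithmetic, so that rounding occurs only in the final conversion to a decimal; the choice of twenty terms is an assumption made here, based on the first neglected term $1/20!$ being smaller than $10^{-18}$.

```python
from fractions import Fraction
from math import factorial

def e_partial(n):
    # Exact sum of the first n terms 1/r!, r = 0, ..., n - 1, of Eqn (6.16).
    return sum(Fraction(1, factorial(r)) for r in range(n))

approx = e_partial(20)
print(float(approx))
```

The printed value may be compared with the figure quoted in the text.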


A modified argument shows that E(x) is also defined for all negative x, 
so that taking account of Eqn (6-9) we have proved the following result: 


THEOREM 6.1 (exponential theorem) For all $x$ it is true that if

$$e = \sum_{r=0}^{\infty}\frac{1}{r!}$$

then

$$e^x = \sum_{r=0}^{\infty}\frac{x^r}{r!}.$$

Let us now dispel any lingering doubts there may be about the uniqueness
of $e^x$. Suppose there is a different solution $z = E(x)$ of Eqn (6.1), with
$z(0) = 1$. Then we must have

$$\frac{dz}{dx} = z \tag{6.17}$$

and so, differencing Eqns (6.11) and (6.17), it is easily shown that

$$\frac{dw}{dx} = w \tag{6.18}$$

where $w = y - z$. We also have $w(0) = y(0) - z(0) = 0$. Now solving
Eqn (6.18) by the same device as before, but this time setting

$$w = \sum_{r=0}^{\infty} b_r x^r, \tag{6.19}$$

we arrive at the recurrence relation

$$(r + 1)b_{r+1} = b_r, \tag{6.20}$$

for $r = 0, 1, 2, \ldots$, which is strictly analogous to Eqn (6.13).

However, setting $x = 0$ in Eqn (6.19) and using the condition $w(0) = 0$
we find that $b_0 = 0$, and so it follows from Eqn (6.20) that all the coefficients
$b_r$ are zero. Hence from Eqn (6.19) we see that $w(x) = 0$, and thus $y = z$,
showing that the function $e^x$ defined by Eqn (6.14) is unique.

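Finally, it remains for us to establish the equivalence of the function
E(x) defined by Eqn (3.6) and the one denoted by the same symbol in Eqn
(6.14). We shall only give the details for positive x. Our best method is first
to expand Eqn (3.6), obtaining

\[ \left[1 + \frac{x}{n}\right]^n = 1 + x + \frac{n(n-1)}{2!}\left(\frac{x}{n}\right)^2 + \frac{n(n-1)(n-2)}{3!}\left(\frac{x}{n}\right)^3 + \cdots + \left(\frac{x}{n}\right)^n. \]

Then, setting $E_{n+1} = [1 + (x/n)]^n$, we rewrite the result in the form

\[ E_{n+1} = 1 + x + \frac{x^2}{2!}\left(1 - \frac{1}{n}\right) + \frac{x^3}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{x^n}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right). \tag{6.21} \]

Defining the number g(r, n) by

\[ g(r, n) = \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{r}{n}\right), \]

we next write Eqn (6.21) as

\[ E_{n+1} = 1 + x + \frac{x^2}{2!}\, g(1, n) + \frac{x^3}{3!}\, g(2, n) + \cdots + \frac{x^n}{n!}\, g(n-1, n). \tag{6.22} \]

Now the difference $S_{n+1} - E_{n+1}$ is

\[ S_{n+1} - E_{n+1} = \frac{x^2}{2!}\,[1 - g(1, n)] + \frac{x^3}{3!}\,[1 - g(2, n)] + \cdots + \frac{x^n}{n!}\,[1 - g(n-1, n)], \]

which is obviously positive since 0 < g(r, n) < 1.
However, it is readily seen that for any given r

\[ \lim_{n\to\infty} g(r, n) = 1, \]

showing that

\[ \lim_{n\to\infty} (S_{n+1} - E_{n+1}) = 0. \]

From Theorem 3.1 (a) it then follows that

\[ \lim_{n\to\infty} E_{n+1} = \lim_{n\to\infty} S_{n+1} = e^x, \]

thereby establishing the equivalence of our two alternative definitions when
x is positive. A similar argument also establishes the equivalence when x is
negative.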
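
The convergence of the two definitions can be observed numerically. The sketch below assumes, consistently with Eqns (6.21) and (6.22), that $S_{n+1}$ denotes the sum of the terms $x^r/r!$ for r = 0, 1, ..., n; the function names are hypothetical and chosen only for this illustration.

```python
import math


def compound_form(x: float, n: int) -> float:
    """E_{n+1} = (1 + x/n)^n, the expression of Eqn (3.6) before the limit."""
    return (1.0 + x / n) ** n


def series_form(x: float, n: int) -> float:
    """S_{n+1}, assumed here to be the sum of x^r / r! for r = 0, ..., n."""
    total, term = 0.0, 1.0
    for r in range(n + 1):
        total += term
        term *= x / (r + 1)
    return total


# The gap S - E stays positive and shrinks as n grows (shown for x = 1).
x = 1.0
for n in (2, 10, 100, 1000, 10000):
    gap = series_form(x, n) - compound_form(x, n)
    print(f"n = {n:6d}  S - E = {gap:.3e}")
```

The printed gaps decrease roughly like 1/n, which suggests why the series is the more practical of the two forms for computation.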

Having now achieved a working definition for E(x) we shall henceforth
always denote this function, known as the exponential function, either by $e^x$
or by exp (x).

It is worth formally recording the differentiability properties of this
function $e^x$. However, we first remark that if $f(x) = e^{g(x)}$, where g(x) is a
differentiable function of x, then, setting g(x) = u so that $f(x) = e^u$ and
using the chain rule in the form displayed in Eqn (5.6), we find that

\[ \frac{df}{dx} = \frac{df}{du}\,\frac{du}{dx} = e^u g'(x) = g'(x)\, e^{g(x)}. \]


THEOREM 6.2 If $f(x) = e^{g(x)}$, where g(x) is a differentiable function of x,
then

\[ \frac{d}{dx}\{e^{g(x)}\} = g'(x)\, e^{g(x)}. \]

In particular, if $g(x) = \alpha x$, where $\alpha$ is a constant, then

\[ \frac{d}{dx}(e^{\alpha x}) = \alpha e^{\alpha x}. \]

Let us now establish an important property of $e^x$. Consider the quotient
$e^x/x^p$, where p is any positive integer. Then from Eqn (6.14) it follows that,
for x > 0,

\[ \frac{e^x}{x^p} = \frac{1}{x^p} + \frac{1}{x^{p-1}} + \cdots + \frac{1}{p!} + \frac{x}{(p+1)!} + \cdots > \frac{x}{(p+1)!}. \]

Hence we have shown that

\[ \lim_{x\to\infty} \frac{e^x}{x^p} > \lim_{x\to\infty} \frac{x}{(p+1)!} \to \infty. \]

We have proved the following result:


THEOREM 6.3 The function $e^x$ increases more quickly than any positive
power of x as $x \to \infty$.


We have already noted that $e^x \to \infty$ as $x \to \infty$, and as $e^{-x} = 1/e^x$ it follows
that $\lim_{x\to-\infty} e^x = 0$ or, equivalently, $\lim_{x\to\infty} e^{-x} = 0$. From Theorem 6.1 it follows
that the function $e^x$ is everywhere positive and since, by virtue of its definition,
its derivative is everywhere a strictly monotonic increasing function of x it
must be a convex function. A graph of $e^x$ is shown in Fig. 6.1.

These last properties are frequently of help when studying limiting problems
involving the exponential function, as illustrated in the following
examples.


Fig. 6.1 The exponential function.


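Example 6.1 Deduce the values of the following limits:

(a) $\lim_{x\to\infty} \dfrac{3e^x + x^3 + 1}{2e^x + x^2}$;

(b) $\lim_{x\to\infty} \dfrac{2e^{2x} + x^2 + 2}{3e^{3x} + x}$;

(c) $\lim_{x\to 0} \dfrac{e^{ax} - e^{bx}}{2x}$.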
Solution (a) We have

\[ \frac{3e^x + x^3 + 1}{2e^x + x^2} = \frac{3 + (x^3/e^x) + (1/e^x)}{2 + (x^2/e^x)}, \]

and from Theorem 6.3 it then follows that all but the initial terms in numerator
and denominator must vanish as $x \to \infty$, so that

\[ \lim_{x\to\infty} \frac{3e^x + x^3 + 1}{2e^x + x^2} = \frac{3}{2}. \]


(b) In this case we have

\[ \frac{2e^{2x} + x^2 + 2}{3e^{3x} + x} = \frac{(2/e^x) + (x^2/e^{3x}) + (2/e^{3x})}{3 + (x/e^{3x})}. \]

However, this time as $x \to \infty$ all the terms of the numerator tend to zero whilst the
denominator approaches the value 3. Hence we have

\[ \lim_{x\to\infty} \frac{2e^{2x} + x^2 + 2}{3e^{3x} + x} = 0. \]
(c) This limit involves an indeterminate form of the type 0/0, so we
appeal to Theorem 5.14. Writing $f(x) = e^{ax} - e^{bx}$ and g(x) = 2x we see that
f(0) = g(0) = 0, and

\[ \lim_{x\to 0} \frac{f'(x)}{g'(x)} = \lim_{x\to 0} \frac{ae^{ax} - be^{bx}}{2} = \frac{a - b}{2}. \]

Hence, by the conditions of Theorem 5.14,

\[ \lim_{x\to 0} \frac{e^{ax} - e^{bx}}{2x} = \lim_{x\to 0} \frac{ae^{ax} - be^{bx}}{2} = \frac{a - b}{2}. \]
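
The limits in parts (a) and (c), and the growth property of Theorem 6.3, can be checked numerically. The following is a minimal sketch only: the expressions are those read from the worked solutions above, and the helper names `ratio_a`, `ratio_c` and `exp_over_power` are introduced here for illustration.

```python
import math


def ratio_a(x: float) -> float:
    """(3e^x + x^3 + 1) / (2e^x + x^2), the expression of part (a)."""
    return (3.0 * math.exp(x) + x**3 + 1.0) / (2.0 * math.exp(x) + x**2)


def ratio_c(x: float, a: float, b: float) -> float:
    """(e^{ax} - e^{bx}) / (2x), the expression of part (c); undefined at x = 0."""
    return (math.exp(a * x) - math.exp(b * x)) / (2.0 * x)


def exp_over_power(x: float, p: int) -> float:
    """The quotient e^x / x^p considered in Theorem 6.3."""
    return math.exp(x) / x**p


for x in (5.0, 20.0, 60.0):
    print(f"x = {x:5.1f}  ratio_a = {ratio_a(x):.10f}")   # approaches 3/2

for x in (1e-2, 1e-4, 1e-6):
    print(f"x = {x:.0e}  ratio_c = {ratio_c(x, 3.0, 1.0):.8f}")  # approaches (3 - 1)/2

for x in (50.0, 100.0, 200.0):
    print(f"x = {x:5.1f}  e^x / x^10 = {exp_over_power(x, 10):.3e}")  # grows without bound
```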


6.2 Differentiation of functions involving the
exponential function


The exponential function occurs frequently in mathematics, and all of its
differentiability properties follow from Theorem 6.2 combined with the
fundamental differentiation theorems of Chapter 5. These results are
straightforward and are best illustrated by examples. The first example illustrates the
ordinary differentiation of simple combinations of functions.


Example 6.2 Differentiate the following functions f(x):
(a) $f(x) = 2x^2 + 3e^{2x}$;
(b) $f(x) = x^2 e^{3x}$;
(c) $f(x) = 2 \exp (x^3 + 2x + 1)$;
(d) $f(x) = e^{2x}/(1 + e^x)$;
(e) $f(x) = \sin (1 + e^{2x})$.
Solution

(a) \[ f'(x) = \frac{d}{dx}(2x^2 + 3e^{2x}) = 4x + 3\,\frac{d}{dx}(e^{2x}) \]
and so
\[ f'(x) = 4x + 6e^{2x}. \]

(b) \[ f'(x) = e^{3x}\,\frac{d}{dx}(x^2) + x^2\,\frac{d}{dx}(e^{3x}) \]
so that
\[ f'(x) = 2xe^{3x} + 3x^2 e^{3x}. \]

(c) This is a more complicated example of a composite function or, more
simply, of a function of a function. Set $u = x^3 + 2x + 1$ so that
\[ f(x) = 2e^u. \]
Then, by the chain rule,
\[ f'(x) = \frac{df}{du}\,\frac{du}{dx}, \]
but
\[ \frac{df}{du} = \frac{d}{du}(2e^u) = 2e^u = 2\exp (x^3 + 2x + 1) \quad\text{and}\quad \frac{du}{dx} = 3x^2 + 2, \]
so that, finally,
\[ f'(x) = (6x^2 + 4) \exp (x^3 + 2x + 1). \]

(d) Writing f(x) in the form
\[ f(x) = e^{2x}(1 + e^x)^{-1} \]
we have
\[ f'(x) = (1 + e^x)^{-1}\,\frac{d}{dx}(e^{2x}) + e^{2x}\,\frac{d}{dx}(1 + e^x)^{-1}, \]
or
\[ f'(x) = \frac{2e^{2x}}{1 + e^x} + e^{2x}\,\frac{d}{dx}(1 + e^x)^{-1}. \]
To evaluate the last term set $1 + e^x = u$, so that we then need to evaluate
\[ \frac{d}{dx}\left(\frac{1}{u}\right), \]
which, by the chain rule, is
\[ \frac{d}{dx}\left(\frac{1}{u}\right) = \frac{d}{du}\left(\frac{1}{u}\right)\frac{du}{dx}. \]
However, $du/dx = e^x$ and $(d/du)(1/u) = -(1/u^2) = -1/(1 + e^x)^2$, showing
that
\[ \frac{d}{dx}(1 + e^x)^{-1} = \frac{-e^x}{(1 + e^x)^2}. \]
Hence, combining our results, we find
\[ f'(x) = \frac{2e^{2x}}{1 + e^x} - \frac{e^{3x}}{(1 + e^x)^2}. \]

(e) This is another composite function. Set $u = 1 + e^{2x}$, so that
f(x) = sin u. Proceeding as before we then see that
\[ f'(x) = \frac{df}{du}\,\frac{du}{dx} = 2e^{2x} \cos (1 + e^{2x}). \]
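
The five results of this example can be checked against a central-difference approximation to the derivative. This is only a numerical sanity check under the assumption that the functions are as read from the worked solution; the helper `central_difference` and the table `CASES` are names introduced here.

```python
import math


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Approximate f'(x) by the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Each entry pairs a function f with the derivative obtained in the solution.
CASES = {
    "a": (lambda x: 2 * x**2 + 3 * math.exp(2 * x),
          lambda x: 4 * x + 6 * math.exp(2 * x)),
    "b": (lambda x: x**2 * math.exp(3 * x),
          lambda x: 2 * x * math.exp(3 * x) + 3 * x**2 * math.exp(3 * x)),
    "c": (lambda x: 2 * math.exp(x**3 + 2 * x + 1),
          lambda x: (6 * x**2 + 4) * math.exp(x**3 + 2 * x + 1)),
    "d": (lambda x: math.exp(2 * x) / (1 + math.exp(x)),
          lambda x: 2 * math.exp(2 * x) / (1 + math.exp(x))
          - math.exp(3 * x) / (1 + math.exp(x)) ** 2),
    "e": (lambda x: math.sin(1 + math.exp(2 * x)),
          lambda x: 2 * math.exp(2 * x) * math.cos(1 + math.exp(2 * x))),
}


def relative_error(label: str, x: float) -> float:
    """Relative gap between the stated derivative and the numerical one."""
    f, df = CASES[label]
    exact = df(x)
    return abs(central_difference(f, x) - exact) / max(1.0, abs(exact))


for label in CASES:
    print(label, f"{relative_error(label, 0.3):.2e}")
```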


Higher order derivatives are defined, as usual, by repeating the differentia- 
tion process the requisite number of times. 


Example 6.3 Find f''(x), given that:
(a) $f(x) = x^2 e^{-2x}$;
(b) $f(x) = (x - 1)e^x$.


Solution (a) Proceeding as before we find that

\[ f'(x) = 2xe^{-2x} - 2x^2 e^{-2x}, \]
and

\[ f''(x) = 2e^{-2x} - 4xe^{-2x} - 4xe^{-2x} + 4x^2 e^{-2x}. \]
Collecting terms we obtain

\[ f''(x) = 2(1 - 4x + 2x^2)e^{-2x}. \]

(b) \[ f'(x) = e^x + (x - 1)e^x = xe^x \]
so that

\[ f''(x) = e^x + xe^x. \]

Partial differentiation of functions involving the exponential function is 
also straightforward, as the following example indicates. 
Example 6.4 Determine $f_x$, $f_y$, and $f_{xy}$, given that

\[ f(x, y) = (x^2 + y^2) \exp (x^2 - y^2). \]


280 / EXPONENTIAL, HYPERBOLIC, AND LOGARITHMIC FUNCTIONS CH 6 


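Solution

\[ \frac{\partial f}{\partial x} = 2x \exp (x^2 - y^2) + (x^2 + y^2)\,\frac{\partial}{\partial x}[\exp (x^2 - y^2)] \]
\[ = 2x \exp (x^2 - y^2) + 2x(x^2 + y^2) \exp (x^2 - y^2). \]
Notice that $\partial f/\partial x$ comprises the sum of everywhere continuous functions
and so is itself everywhere continuous.

\[ \frac{\partial f}{\partial y} = 2y \exp (x^2 - y^2) + (x^2 + y^2)\,\frac{\partial}{\partial y}[\exp (x^2 - y^2)] \]
\[ = 2y \exp (x^2 - y^2) - 2y(x^2 + y^2) \exp (x^2 - y^2). \]

The partial derivative $\partial f/\partial y$ is also seen to be everywhere continuous. Theorem
5.24 now tells us that $\partial^2 f/\partial x\,\partial y = \partial^2 f/\partial y\,\partial x$, so that we may differentiate either
$\partial f/\partial x$ or $\partial f/\partial y$ to arrive at $f_{xy}$. We choose to differentiate $f_x$ partially with
respect to y.

\[ \frac{\partial^2 f}{\partial y\,\partial x} = -4xy \exp (x^2 - y^2) + 4xy \exp (x^2 - y^2) - 4xy(x^2 + y^2) \exp (x^2 - y^2), \]
whence
\[ \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = -4xy(x^2 + y^2) \exp (x^2 - y^2). \]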


As a final illustration, let us consider an application of Theorem 5.21 to
the exponential function.
Example 6.5 Find df/dt, given that

\[ f(x, y) = xy \exp (x^2 + 3y + 1), \]
with $x = \sin t$, $y = t^3 + 1$.


Solution Here we must use the chain rule formula for partial differentiation:

\[ \frac{df}{dt} = \frac{\partial f}{\partial x}\,\frac{dx}{dt} + \frac{\partial f}{\partial y}\,\frac{dy}{dt}. \]
Now

\[ \frac{\partial f}{\partial x} = y \exp (x^2 + 3y + 1) + xy\,\frac{\partial}{\partial x} \exp (x^2 + 3y + 1) \]
\[ = y \exp (x^2 + 3y + 1) + 2x^2 y \exp (x^2 + 3y + 1), \]
and thus
\[ \frac{\partial f}{\partial x} = y(1 + 2x^2) \exp (x^2 + 3y + 1). \]
Similarly,
\[ \frac{\partial f}{\partial y} = x \exp (x^2 + 3y + 1) + xy\,\frac{\partial}{\partial y} \exp (x^2 + 3y + 1) \]
\[ = x \exp (x^2 + 3y + 1) + 3xy \exp (x^2 + 3y + 1), \]
and thus
\[ \frac{\partial f}{\partial y} = x(1 + 3y) \exp (x^2 + 3y + 1). \]

We also have
\[ \frac{dx}{dt} = \cos t \quad\text{and}\quad \frac{dy}{dt} = 3t^2, \]
and so df/dt may now be found by direct substitution into the chain rule
formula, with the following result:

\[ \frac{df}{dt} = [(t^3 + 1)(1 + 2 \sin^2 t) \cos t + 3t^2(3t^3 + 4) \sin t] \exp (4 + 3t^3 + \sin^2 t). \]
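
The final expression for df/dt may be compared with a direct numerical derivative of f regarded as a function of t alone. The sketch below is an illustrative check, assuming the functions of Example 6.5 as stated; the names `f_of_t` and `df_dt_formula` are hypothetical.

```python
import math


def f_of_t(t: float) -> float:
    """f(x, y) = xy exp(x^2 + 3y + 1) with x = sin t and y = t^3 + 1."""
    x = math.sin(t)
    y = t**3 + 1.0
    return x * y * math.exp(x**2 + 3.0 * y + 1.0)


def df_dt_formula(t: float) -> float:
    """The closed form for df/dt obtained from the chain rule."""
    s = math.sin(t)
    bracket = ((t**3 + 1.0) * (1.0 + 2.0 * s**2) * math.cos(t)
               + 3.0 * t**2 * (3.0 * t**3 + 4.0) * s)
    return bracket * math.exp(4.0 + 3.0 * t**3 + s**2)


def central_difference(f, t: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient used as an independent estimate."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


for t in (0.2, 0.5, 1.0):
    print(f"t = {t}: formula = {df_dt_formula(t):.6f}, "
          f"numerical = {central_difference(f_of_t, t):.6f}")
```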


6.3 The logarithmic function


Having introduced the exponential function there is now a need for an inverse
function. The implicit function theorem (Theorem 5.23) tells us that such an
inverse function exists and, furthermore, that it is differentiable whenever
$(d/dx)(e^x) \ne 0$. However, this is always the case since we have already seen
that $(d/dx)(e^x) = e^x$, which is never zero for x in the interval $-\infty < x < \infty$.
Hence a differentiable function, inverse to the exponential function, exists for
all x. We call it the natural logarithmic function and denote it by $\log_e$ whenever
it is necessary to indicate that it has the base e.


DEFINITION 6.1 We define the natural logarithmic function $\log_e x$ by the
requirement that
\[ y = \log_e x \iff x = e^y. \]


We may use this definition, together with Corollary 1 to the implicit
function theorem, to compute the derivative of $\log_e x$. As dy/dx = 1/(dx/dy)
and $x = e^y$, it follows that $dx/dy = e^y$, whence

\[ \frac{dy}{dx} = \frac{1}{e^y} = \frac{1}{x}. \]

Now $e^y$ is essentially positive, so that

\[ \frac{d}{dx}(\log_e x) = \frac{1}{x} \quad\text{for } x > 0. \tag{6.23} \]


It is obvious that $\log_e 1 = 0$ and, as x increases strictly monotonically
with y, it also follows that $\log_e x \to +\infty$ as $x \to +\infty$, and $\log_e x \to -\infty$ as
$x \to 0$.

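Let us now prove that

\[ \lim_{x\to\infty} \frac{\log_e x}{x^\alpha} = 0 \quad\text{for all } \alpha > 0. \]

As $x = e^y$ we have

\[ \frac{\log_e x}{x^\alpha} = \frac{y}{e^{\alpha y}} \]
and so
\[ \lim_{x\to\infty} \frac{\log_e x}{x^\alpha} = \lim_{y\to\infty} \frac{y}{e^{\alpha y}} = \frac{1}{\alpha} \lim_{y\to\infty} \frac{\alpha y}{e^{\alpha y}}. \]

Setting $u = \alpha y$ we arrive at
\[ \lim_{x\to\infty} \frac{\log_e x}{x^\alpha} = \frac{1}{\alpha} \lim_{u\to\infty} \frac{u}{e^u} = 0, \]

by virtue of Theorem 6.3.
Collecting the previous results we arrive at the following theorem.


THEOREM 6.4 If $y = \log_e x$, then

(a) \[ \frac{dy}{dx} = \frac{1}{x} \quad\text{for } x > 0; \]

(b) \[ \lim_{x\to\infty} \frac{\log_e x}{x^\alpha} = 0 \quad\text{for all } \alpha > 0. \]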


Logarithms to other bases can be used if convenient. They are defined as
follows.


DEFINITION 6.2 We define the logarithmic function to the base c, denoted
by $\log_c x$ where c is a positive number, by the requirement that

\[ y = \log_c x \iff x = c^y. \]


For reference purposes we record the following familiar properties of the
logarithmic function, established in elementary courses.

Basic properties of the logarithmic function


Let $\log_e$ and $\log_c$ represent logarithms to the bases e and c respectively, and
a, b, r be real numbers; then:

(a) $\log_e ab = \log_e a + \log_e b$;
(b) $\log_e a^r = r \log_e a$;
(c) $\log_c a = \dfrac{\log_e a}{\log_e c}$;
(d) $\log_c e = \dfrac{1}{\log_e c}$.

Results (c) and (d) quoted above are immediately useful if it is necessary
to differentiate $\log_a x$. For we have

\[ \log_a x = \frac{\log_e x}{\log_e a} \]
so that
\[ \frac{d}{dx}(\log_a x) = \frac{1}{\log_e a}\,\frac{d}{dx}(\log_e x), \]
whence,
\[ \frac{d}{dx}(\log_a x) = \frac{1}{x \log_e a} = \frac{\log_a e}{x}. \tag{6.24} \]


Let us now find the derivative of the function $a^x$, where a is any positive
number. Notice first that, by virtue of Definition 6.1,

\[ a = e^{\log_e a} \]

so that

\[ a^x = (e^{\log_e a})^x = e^{x \log_e a}. \]

Now $\log_e a$ is simply a constant, so we have

\[ \frac{d}{dx}(a^x) = \frac{d}{dx}(e^{x \log_e a}) = \log_e a \cdot e^{x \log_e a} = a^x \log_e a. \]
We have thus established the useful result
\[ \frac{d}{dx}(a^x) = a^x \log_e a. \tag{6.25} \]


This result can also be obtained in another manner. We set
\[ y = a^x, \]

so that taking the natural logarithm gives
\[ \log_e y = x \log_e a. \]

Differentiating this result with respect to x we obtain

\[ \frac{d}{dx}(\log_e y) = \frac{d}{dx}(x \log_e a) \]

or
\[ \frac{1}{y}\,\frac{dy}{dx} = \log_e a, \]
and so
\[ \frac{dy}{dx} = \frac{d}{dx}(a^x) = y \log_e a = a^x \log_e a. \]
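
Results (6.24) and (6.25) lend themselves to a quick numerical check. The following Python sketch compares each formula with a central-difference estimate; the function names are illustrative assumptions, and `math.log(x, a)` is used for the logarithm to the base a.

```python
import math


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d_log_base(a: float, x: float) -> float:
    """Eqn (6.24): d/dx (log_a x) = 1 / (x log_e a)."""
    return 1.0 / (x * math.log(a))


def d_power(a: float, x: float) -> float:
    """Eqn (6.25): d/dx (a^x) = a^x log_e a."""
    return a**x * math.log(a)


for a in (3.0, 10.0, 0.5):
    x = 1.5
    num_pow = central_difference(lambda t: a**t, x)
    num_log = central_difference(lambda t: math.log(t, a), x)
    print(f"a = {a}: d(a^x) {d_power(a, x):.8f} vs {num_pow:.8f}; "
          f"d(log_a x) {d_log_base(a, x):.8f} vs {num_log:.8f}")
```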


For our final general result we consider the differentiation of the function
$y = \log_e g(x)$, where g(x) is a differentiable function. Setting u = g(x) so
that $y = \log_e u$ and using the chain rule gives

\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \frac{1}{u}\, g'(x), \]

so that, finally,

\[ \frac{d}{dx}[\log_e g(x)] = \frac{g'(x)}{g(x)}. \tag{6.26} \]


Henceforth, unless otherwise stated, the natural logarithm will always be
used, so for simplicity of notation we shall write log in place of $\log_e$. Often, in
other texts, the notation ln is used to denote the natural logarithmic function.

Let us now examine some representative cases of limits involving 
logarithms. 


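Example 6.6 Evaluate the following limits:

(a) $\lim_{x\to\infty} \dfrac{\log x^3}{x}$;

(b) $\lim_{x\to\infty} \dfrac{\log a^x}{3x + 1}$, with a > 0;

(c) $\lim_{x\to\infty} \dfrac{1 + x^3 \log [2 + (1/x)]}{3x^3 + 2x^2 + 1}$;

(d) $\lim_{x\to 0} \dfrac{\log (1 + 3x)}{2x}$.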
Solution (a) We have
\[ \frac{\log x^3}{x} = \frac{3 \log x}{x} \]
so that by Theorem 6.4 (b) it follows at once that
\[ \lim_{x\to\infty} \frac{\log x^3}{x} = 0. \]

(b) We have
\[ \frac{\log a^x}{3x + 1} = \frac{x \log a}{3x + 1} \]
and so
\[ \lim_{x\to\infty} \frac{\log a^x}{3x + 1} = \lim_{x\to\infty} \frac{x \log a}{3x + 1} = \frac{1}{3} \log a. \]

(c) Using the result
\[ \frac{1 + x^3 \log [2 + (1/x)]}{3x^3 + 2x^2 + 1} = \frac{(1/x^3) + \log [2 + (1/x)]}{3 + (2/x) + (1/x^3)} \]
it is at once apparent that
\[ \lim_{x\to\infty} \frac{1 + x^3 \log [2 + (1/x)]}{3x^3 + 2x^2 + 1} = \frac{1}{3} \log 2. \]

(d) This is an indeterminate form of the type 0/0. It is easily verified that
Theorem 5.14 (L'Hospital's rule) is applicable so that
\[ \lim_{x\to 0} \frac{f(x)}{g(x)} = \lim_{x\to 0} \frac{f'(x)}{g'(x)}, \]
with f(x) = log (1 + 3x) and g(x) = 2x. As f'(x) = 3/(1 + 3x) and g'(x) = 2
it thus follows that
\[ \lim_{x\to 0} \frac{\log (1 + 3x)}{2x} = \lim_{x\to 0} \frac{3}{2(1 + 3x)} = \frac{3}{2}. \]


Example 6.7 Determine the derivative dy/dx for each of the following
functions y = f(x) where:

(a) $f(x) = \log (3x^2 + 2)$;

(b) $f(x) = \log \tan 2x$;

(c) $f(x) = 3^x x^2$;

(d) $f(x) = (\sin x)^x$.
Solution (a) Here we must apply Eqn (6.26), with $g(x) = 3x^2 + 2$. As
g'(x) = 6x it follows at once that
\[ \frac{d}{dx}[\log (3x^2 + 2)] = \frac{6x}{3x^2 + 2}. \]

(b) Again we must use Eqn (6.26), but this time with g(x) = tan 2x.
As $g'(x) = 2 \sec^2 2x$, we have that
\[ \frac{d}{dx}(\log \tan 2x) = \frac{2 \sec^2 2x}{\tan 2x} = 2 \sec 2x \,\mathrm{cosec}\, 2x. \]

(c) We have
\[ \frac{d}{dx}(3^x x^2) = 3^x\,\frac{d}{dx}(x^2) + x^2\,\frac{d}{dx}(3^x) \]
which, by virtue of Eqn (6.25), becomes
\[ \frac{d}{dx}(3^x x^2) = 2x \cdot 3^x + x^2\, 3^x \log 3 \]
giving
\[ \frac{d}{dx}(3^x x^2) = (2x + x^2 \log 3)\,3^x. \]

(d) We set $y = (\sin x)^x$ and take logarithms to get
\[ \log y = x \log \sin x. \]
Now, differentiating, we find that
\[ \frac{1}{y}\,\frac{dy}{dx} = \log \sin x + x\,\frac{d}{dx}(\log \sin x) \]
or
\[ \frac{dy}{dx} = (\sin x)^x (\log \sin x + x \cot x). \]
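
The results of parts (c) and (d), which rely on Eqn (6.25) and on logarithmic differentiation, can be compared with numerical derivatives. This is a minimal sketch assuming the functions as stated in Example 6.7; the evaluation points are chosen so that sin x > 0, and all function names are introduced here for illustration.

```python
import math


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def power_product(x: float) -> float:
    """Part (c): y = 3^x x^2."""
    return 3.0**x * x**2


def d_power_product(x: float) -> float:
    """Part (c) result: (2x + x^2 log 3) 3^x."""
    return (2.0 * x + x**2 * math.log(3.0)) * 3.0**x


def sin_to_the_x(x: float) -> float:
    """Part (d): y = (sin x)^x, meaningful here only where sin x > 0."""
    return math.sin(x) ** x


def d_sin_to_the_x(x: float) -> float:
    """Part (d) result: (sin x)^x (log sin x + x cot x)."""
    cot = math.cos(x) / math.sin(x)
    return math.sin(x) ** x * (math.log(math.sin(x)) + x * cot)


for x in (0.5, 1.0, 2.0):
    print(f"x = {x}: (c) {d_power_product(x):.8f} vs "
          f"{central_difference(power_product, x):.8f}; "
          f"(d) {d_sin_to_the_x(x):.8f} vs "
          f"{central_difference(sin_to_the_x, x):.8f}")
```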


Partial differentiation involving the logarithmic function is equally 
straightforward. The final example illustrates a typical situation. 


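Example 6.8 If u = x log [1 + (x/y)] + y log [1 + (y/x)], show that
\[ x\,\frac{\partial u}{\partial x} + y\,\frac{\partial u}{\partial y} = u. \]


Solution We start by computing $\partial u/\partial x$. It is readily seen that

\[ \frac{\partial u}{\partial x} = \log \left(1 + \frac{x}{y}\right) + x\,\frac{\partial}{\partial x}\left[\log \left(1 + \frac{x}{y}\right)\right] + y\,\frac{\partial}{\partial x}\left[\log \left(1 + \frac{y}{x}\right)\right] \]
\[ = \log \left(1 + \frac{x}{y}\right) + x\,\frac{(1/y)}{1 + (x/y)} + y\,\frac{(-y/x^2)}{1 + (y/x)} \]
\[ = \log \left(1 + \frac{x}{y}\right) + \frac{x}{x + y} - \frac{y^2}{x(x + y)}. \]

The symmetry of x and y in u then allows us to interchange x and y in the
above partial derivative in order to derive $\partial u/\partial y$ without further calculation.
We obtain

\[ \frac{\partial u}{\partial y} = \log \left(1 + \frac{y}{x}\right) + \frac{y}{x + y} - \frac{x^2}{y(x + y)}. \]
Hereafter, direct substitution verifies that
\[ x\,\frac{\partial u}{\partial x} + y\,\frac{\partial u}{\partial y} = u. \]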


6.4 Hyperbolic functions


It is useful to define new functions called the hyperbolic sine, written sinh x,
and the hyperbolic cosine, written cosh x, which are related to the exponential
function. This is achieved as follows.


DEFINITION 6.3 (hyperbolic functions) For all real x we define sinh x and
cosh x by the requirement that

\[ \sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2}. \]
It is an immediate consequence of the series for $e^x$ and $e^{-x}$ that
\[ \sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n+1}}{(2n+1)!} + \cdots \tag{6.27} \]
and
\[ \cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \cdots + \frac{x^{2n}}{(2n)!} + \cdots. \tag{6.28} \]


Furthermore, it also follows from Definition 6.3 that sinh x is an odd function
and cosh x is an even function.

We now define the hyperbolic tangent, cotangent, cosecant, and secant,
denoted by tanh x, coth x, cosech x, and sech x, as follows.

DEFINITION 6.4

\[ \tanh x = \frac{\sinh x}{\cosh x}, \qquad \coth x = \frac{\cosh x}{\sinh x}, \]
\[ \mathrm{cosech}\, x = \frac{1}{\sinh x}, \qquad \mathrm{sech}\, x = \frac{1}{\cosh x}. \]


We illustrate how useful identities may be established directly from
Definition 6.3. Let us prove that

\[ \sinh a \cosh b + \cosh a \sinh b = \sinh (a + b). \]
Substituting for sinh a, cosh a, sinh b, and cosh b from Definition 6.3 we obtain

\[ \left(\frac{e^a - e^{-a}}{2}\right)\left(\frac{e^b + e^{-b}}{2}\right) + \left(\frac{e^a + e^{-a}}{2}\right)\left(\frac{e^b - e^{-b}}{2}\right) = \frac{e^{(a+b)} - e^{-(a+b)}}{2}, \]

which proves our result since $[e^{(a+b)} - e^{-(a+b)}]/2 = \sinh (a + b)$. Similar
manipulation establishes the validity of all the identities listed below in
Table 6.1.


Table 6.1 Identities for hyperbolic functions


$\sinh (x + y) = \sinh x \cosh y + \cosh x \sinh y$; (6.29)
$\cosh (x + y) = \cosh x \cosh y + \sinh x \sinh y$; (6.30)
$\cosh^2 x - \sinh^2 x = 1$; (6.31)
$\tanh^2 x + \mathrm{sech}^2 x = 1$; (6.32)
$1 + \mathrm{cosech}^2 x = \coth^2 x$. (6.33)


Table 6.2 Derivatives of hyperbolic functions


$\dfrac{d}{dx}(\sinh x) = \cosh x$; (6.34)

$\dfrac{d}{dx}(\cosh x) = \sinh x$; (6.35)

$\dfrac{d}{dx}(\tanh x) = \mathrm{sech}^2 x$; (6.36)

$\dfrac{d}{dx}(\coth x) = -\mathrm{cosech}^2 x$; (6.37)

$\dfrac{d}{dx}(\mathrm{cosech}\, x) = -\mathrm{cosech}\, x \coth x$; (6.38)

$\dfrac{d}{dx}(\mathrm{sech}\, x) = -\mathrm{sech}\, x \tanh x$. (6.39)
Appeal to Definitions 6.3 and 6.4 together with the differentiability
properties of the exponential function establishes Table 6.2, the table of
derivatives.
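
The identities (6.29) to (6.33) and the derivatives (6.34) to (6.39) can be spot-checked numerically. The sketch below is illustrative only: `sech`, `cosech` and `coth` are helper functions defined here from Definition 6.4, since Python's `math` module provides only `sinh`, `cosh` and `tanh`.

```python
import math


def sech(x: float) -> float:
    return 1.0 / math.cosh(x)


def cosech(x: float) -> float:
    return 1.0 / math.sinh(x)  # undefined at x = 0


def coth(x: float) -> float:
    return math.cosh(x) / math.sinh(x)  # undefined at x = 0


def identity_residuals(x: float, y: float) -> list:
    """Left side minus right side for identities (6.29) to (6.33)."""
    return [
        math.sinh(x + y) - (math.sinh(x) * math.cosh(y) + math.cosh(x) * math.sinh(y)),
        math.cosh(x + y) - (math.cosh(x) * math.cosh(y) + math.sinh(x) * math.sinh(y)),
        math.cosh(x) ** 2 - math.sinh(x) ** 2 - 1.0,
        math.tanh(x) ** 2 + sech(x) ** 2 - 1.0,
        1.0 + cosech(x) ** 2 - coth(x) ** 2,
    ]


# (function, derivative listed in Table 6.2)
DERIVATIVES = [
    (math.sinh, math.cosh),
    (math.cosh, math.sinh),
    (math.tanh, lambda x: sech(x) ** 2),
    (coth, lambda x: -cosech(x) ** 2),
    (cosech, lambda x: -cosech(x) * coth(x)),
    (sech, lambda x: -sech(x) * math.tanh(x)),
]


def derivative_errors(x: float, h: float = 1e-6) -> list:
    """Gap between each tabulated derivative and a central difference."""
    return [abs((f(x + h) - f(x - h)) / (2.0 * h) - df(x)) for f, df in DERIVATIVES]


print(identity_residuals(0.8, -0.3))
print(derivative_errors(0.8))
```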

The behaviour of the hyperbolic functions is indicated graphically in
Fig. 6.2 and for comparison the graphs of $y = \tfrac{1}{2}e^x$ and $y = \tfrac{1}{2}e^{-x}$ have been
added to Fig. 6.2 (a).


Functions inverse to the hyperbolic sine and cosine are introduced 
through the following definitions. 


DEFINITION 6.5 The inverse hyperbolic sine, arcsinh x, and the inverse
hyperbolic cosine, arccosh x, are defined by the relationships:

(a) $y = \mathrm{arcsinh}\, x \iff x = \sinh y$;

(b) $y = \mathrm{arccosh}\, x \iff x = \cosh y$.

Their derivatives are readily obtained by direct use of this definition and
we illustrate the process by deriving (d/dx) arcsinh x.


If y = arcsinh x, then x = sinh y and so, differentiating with respect to
x, we obtain

\[ 1 = \cosh y\,\frac{dy}{dx}, \]
and so
\[ \frac{dy}{dx} = \frac{1}{\cosh y} = \frac{1}{\sqrt{1 + \sinh^2 y}} \]
by virtue of identity (6.31) and the fact that cosh y is essentially positive.
Hence, using the fact that x = sinh y, we find that

\[ \frac{d}{dx}(\mathrm{arcsinh}\, x) = \frac{1}{\sqrt{1 + x^2}} \quad\text{for all } x. \]

In the case of y = arccosh x we must proceed with more care.

If y = arccosh x, so that x = cosh y, then, as before, differentiating with
respect to x gives

\[ 1 = \sinh y\,\frac{dy}{dx}, \]
or,
\[ \frac{dy}{dx} = \frac{1}{\sinh y}. \]


Fig. 6.2 Hyperbolic functions: (a) y = sinh x and y = cosh x; (b) y = tanh x;
(c) y = coth x; (d) y = cosech x; (e) y = sech x.


Now from the graph in Fig. 6.2 (a) we see that sinh y is positive if its argument
arccosh x > 0 and negative if arccosh x < 0. Thus two different inverse
functions must be defined.
If arccosh x > 0, then

\[ \frac{dy}{dx} = \frac{1}{\sinh y} = \frac{1}{\sqrt{\cosh^2 y - 1}} = \frac{1}{\sqrt{x^2 - 1}} \quad\text{for } x > 1. \]

Conversely, if arccosh x < 0, then

\[ \frac{dy}{dx} = \frac{1}{\sinh y} = \frac{-1}{\sqrt{\cosh^2 y - 1}} = \frac{-1}{\sqrt{x^2 - 1}} \quad\text{for } x > 1. \]

Other inverse hyperbolic functions are defined similarly and it is left to
the reader to verify the remaining entries in Table 6.3. (In many books
the inverse function is denoted by a superscript -1, when $\sinh^{-1} x$ is written
in place of arcsinh x, etc.)


Table 6.3 Derivatives of inverse hyperbolic functions


$\dfrac{d}{dx}\left[\mathrm{arcsinh}\,\dfrac{x}{a}\right] = \dfrac{1}{\sqrt{x^2 + a^2}}$, for all x; (6.40)

$\dfrac{d}{dx}\left[\mathrm{arccosh}\,\dfrac{x}{a}\right] = \dfrac{1}{\sqrt{x^2 - a^2}}$, for $\mathrm{arccosh}\,\dfrac{x}{a} > 0$ and $\dfrac{x}{a} > 1$; (6.41)

$\dfrac{d}{dx}\left[\mathrm{arccosh}\,\dfrac{x}{a}\right] = \dfrac{-1}{\sqrt{x^2 - a^2}}$, for $\mathrm{arccosh}\,\dfrac{x}{a} < 0$ and $\dfrac{x}{a} > 1$; (6.42)

$\dfrac{d}{dx}\left[\mathrm{arctanh}\,\dfrac{x}{a}\right] = \dfrac{a}{a^2 - x^2}$, for $x^2 < a^2$; (6.43)

$\dfrac{d}{dx}\left[\mathrm{arccoth}\,\dfrac{x}{a}\right] = \dfrac{a}{a^2 - x^2}$, for $x^2 > a^2$; (6.44)

$\dfrac{d}{dx}\left[\mathrm{arccosech}\,\dfrac{x}{a}\right] = \dfrac{-a}{|x|\sqrt{x^2 + a^2}}$, for all $x \ne 0$; (6.45)

$\dfrac{d}{dx}\left[\mathrm{arcsech}\,\dfrac{x}{a}\right] = \dfrac{-a}{x\sqrt{a^2 - x^2}}$, for $\mathrm{arcsech}\,\dfrac{x}{a} > 0$ and $0 < \dfrac{x}{a} < 1$; (6.46)

$\dfrac{d}{dx}\left[\mathrm{arcsech}\,\dfrac{x}{a}\right] = \dfrac{a}{x\sqrt{a^2 - x^2}}$, for $\mathrm{arcsech}\,\dfrac{x}{a} < 0$ and $0 < \dfrac{x}{a} < 1$. (6.47)
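
Several of the inverse hyperbolic derivatives can be checked with the functions `math.asinh`, `math.acosh` and `math.atanh`. The sketch below is a hedged illustration: it assumes a > 0, takes `math.acosh` as the positive branch of arccosh (its negative supplies the other branch), and uses hypothetical helper names.

```python
import math


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


a = 2.0  # assumed positive scale factor

# (description, function of x, stated derivative, sample point in the valid range)
CHECKS = [
    ("arcsinh(x/a)", lambda x: math.asinh(x / a),
     lambda x: 1.0 / math.sqrt(x**2 + a**2), 1.5),
    ("arccosh(x/a), positive branch", lambda x: math.acosh(x / a),
     lambda x: 1.0 / math.sqrt(x**2 - a**2), 3.0),
    ("arccosh(x/a), negative branch", lambda x: -math.acosh(x / a),
     lambda x: -1.0 / math.sqrt(x**2 - a**2), 3.0),
    ("arctanh(x/a)", lambda x: math.atanh(x / a),
     lambda x: a / (a**2 - x**2), 0.5),
]


def check_errors() -> list:
    """Absolute gap between each stated derivative and the numerical estimate."""
    return [abs(central_difference(f, x0) - df(x0)) for _, f, df, x0 in CHECKS]


for (name, _, _, x0), err in zip(CHECKS, check_errors()):
    print(f"{name:32s} at x = {x0}: error = {err:.2e}")
```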


The following examples are representative of the limiting and differentiability
problems encountered with hyperbolic functions.


Example 6.9

(a) Evaluate $\lim_{x\to\infty} \dfrac{5 \sinh 3x + xe^{2x}}{4e^{3x}}$;

(b) Find f'(x) if $f(x) = \sinh (x^2 + 3x + 1)^{1/2}$;
(c) Find f'(x) given that f(x) < 0 is given by $f(x) = \mathrm{arccosh}\, (\sin^2 x)$;
(d) Determine $f_x$ and $f_y$ given that $f(x, y) = xy \cosh (x^2 + y^2)$.


Solution (a) From Definition 6.3 it is easily seen that for large x
\[ \sinh 3x \approx \tfrac{1}{2} e^{3x}. \]
Hence, applying the usual arguments, it follows at once that

\[ \lim_{x\to\infty} \frac{5 \sinh 3x + xe^{2x}}{4e^{3x}} = \lim_{x\to\infty} \frac{(5e^{3x}/2) + xe^{2x}}{4e^{3x}} = \frac{5}{8}. \]

(b) \[ f'(x) = [\cosh (x^2 + 3x + 1)^{1/2}] \cdot \frac{1}{2} \cdot \frac{(2x + 3)}{(x^2 + 3x + 1)^{1/2}} \]
so that
\[ f'(x) = \frac{(2x + 3)}{2\sqrt{x^2 + 3x + 1}} \cosh (x^2 + 3x + 1)^{1/2}. \]

(c) Set $y = \mathrm{arccosh}\, (\sin^2 x)$ so that
\[ \sin^2 x = \cosh y. \]
Differentiation with respect to x then gives

\[ 2 \sin x \cos x = \sinh y\,\frac{dy}{dx} \]
or
\[ \frac{dy}{dx} = \frac{2 \sin x \cos x}{\sinh y}. \]
As we are told that y = f(x) < 0 it then follows that
\[ \frac{dy}{dx} = \frac{-2 \sin x \cos x}{\sqrt{\cosh^2 y - 1}} = \frac{-2 \sin x \cos x}{\sqrt{\sin^4 x - 1}}, \]
provided $\sin x \ne 1$.

(d) \[ \frac{\partial f}{\partial x} = y \cosh (x^2 + y^2) + xy\,\frac{\partial}{\partial x} \cosh (x^2 + y^2) \]
\[ = y \cosh (x^2 + y^2) + 2x^2 y \sinh (x^2 + y^2). \]
Similarly,
\[ \frac{\partial f}{\partial y} = x \cosh (x^2 + y^2) + 2xy^2 \sinh (x^2 + y^2). \]


6.5 Exponential function with a complex argument


If we formally replace x by ix in the series expansion of $e^x$ in Theorem 6.1
we obtain
\[ e^{ix} = 1 + ix - \frac{x^2}{2!} - i\frac{x^3}{3!} + \frac{x^4}{4!} + i\frac{x^5}{5!} - \frac{x^6}{6!} - \cdots. \]
Clearly $e^{ix}$ is a complex number for any fixed real number x and, writing
it in the form $e^{ix} = C(x) + iS(x)$, it follows by equating real and imaginary
parts that

\[ C(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots \]
and
\[ S(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n+1)!} + \cdots. \]

Thus, in fact, if x is regarded as a variable, S(x) and C(x) are functions of
x and $e^{ix}$ is, in some sense yet to be properly defined, a function of a complex
variable.


Assuming that the series for C(x) may be differentiated term by term it is
easily verified that

\[ C'(x) = -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \cdots + (-1)^{n+1} \frac{x^{2n+1}}{(2n+1)!} + \cdots. \]

Next, differentiating C'(x) again with respect to x yields

\[ C''(x) = -1 + \frac{x^2}{2!} - \frac{x^4}{4!} + \cdots + (-1)^{n+1} \frac{x^{2n}}{(2n)!} + \cdots, \]

showing that in fact
\[ C''(x) = -C(x). \]
Now, setting x = 0 in the series for C(x) and C'(x), we find that
C(0) = 1 and C'(0) = 0.


Hence the function C(x) is seen to be the solution of the special differential
equation

\[ \frac{d^2 y}{dx^2} + y = 0 \quad\text{with}\quad y(0) = 1 \ \text{and}\ y'(0) = 0. \]


This same differential equation with the conditions on y was encountered
in Example 5.13 (a), where it was derived as the equation satisfied by y = cos x
and its derivatives. Thus the function C(x) is, in reality, the function cos x.
An analogous argument establishes that S(x) = sin x. On account of this
identification of C(x) and S(x) we may write

\[ e^{ix} = \cos x + i \sin x. \tag{6.48} \]


As a direct consequence of replacing x by -x in Eqn (6.48) and using the
fact that cos x is even, but sin x is odd, we find that

\[ e^{-ix} = \cos x - i \sin x. \tag{6.49} \]
Combination of Eqns (6.48) and (6.49) leads to the following definitions of
the sine and cosine functions.


DEFINITION 6.6
\[ \sin x = \frac{e^{ix} - e^{-ix}}{2i} \quad\text{and}\quad \cos x = \frac{e^{ix} + e^{-ix}}{2}. \]
Comparison of Eqns (4.15) and (6.48) shows that $e^{ix}$ represents a complex
number of unit modulus lying on the unit circle drawn about the origin.
The argument of $e^{ix}$ is x.
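
Eqn (6.48) and the statements about modulus and argument can be illustrated with Python's `cmath` module. This is a small sketch rather than a proof; `euler_rhs` is a name introduced here, and the sample values of x are taken between $-\pi$ and $\pi$ so that `cmath.phase` returns x itself.

```python
import cmath
import math


def euler_rhs(x: float) -> complex:
    """The right-hand side cos x + i sin x of Eqn (6.48)."""
    return complex(math.cos(x), math.sin(x))


def sin_from_exponentials(x: float) -> complex:
    """Definition 6.6: sin x = (e^{ix} - e^{-ix}) / (2i)."""
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j


def cos_from_exponentials(x: float) -> complex:
    """Definition 6.6: cos x = (e^{ix} + e^{-ix}) / 2."""
    return (cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2


for x in (0.0, 1.0, math.pi / 3, -2.5):
    z = cmath.exp(1j * x)
    print(f"x = {x:+.4f}  e^(ix) = {z:.6f}  |z| = {abs(z):.6f}  "
          f"arg z = {cmath.phase(z):+.4f}")
```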


Slightly more general than Eqn (6.48) is the complex number $e^{(x+iy)}$ for,
by the property of indices together with Eqn (6.48), we have

\[ e^{(x+iy)} = e^x e^{iy} = e^x(\cos y + i \sin y), \tag{6.50} \]
showing that

\[ |e^{(x+iy)}| = e^x \quad\text{and}\quad \arg e^{(x+iy)} = y. \tag{6.51} \]

Thus the modulus-argument form of a general non-zero complex number z
may be written

\[ z = re^{i\theta}, \quad\text{where } r = |z| \text{ and } \theta = \arg z. \tag{6.52} \]


This is, of course, an alternative form of Eqn (4.15).

As it is true for any exponent $\alpha$ that $(a^x)^\alpha = a^{\alpha x}$, it follows that
$(e^{ix})^\alpha = e^{i\alpha x}$, so that from Eqn (6.48) we arrive at the result

\[ (\cos x + i \sin x)^\alpha = \cos \alpha x + i \sin \alpha x. \tag{6.53} \]
This is simply de Moivre's theorem (Theorem 4.2) for any exponent $\alpha$ and
not just for the integral values used in the first proof of this important theorem.

To close, let us apply these results to give an alternative derivation of the
results of Example 4.10, and also to express $\sin^n \theta$ and $\cos^n \theta$ in terms of
sums involving $\sin r\theta$ and $\cos r\theta$, as promised in that example. As in Chapter
4, the argument is best presented by example.


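Example 6.10


(a) Express $\sin n\theta$ and $\cos n\theta$ in terms of $\cos \theta$ and $\sin \theta$. Deduce the
form taken by the result when n = 4.

(b) Express $\cos^7 \theta$ in terms of $\cos r\theta$.
(c) Express $\sin^5 \theta$ in terms of $\sin r\theta$.


Solution

(a)
\[ \cos n\theta = \mathrm{Re}(e^{in\theta}) = \mathrm{Re}[(e^{i\theta})^n] = \mathrm{Re}[(\cos \theta + i \sin \theta)^n], \]
\[ \sin n\theta = \mathrm{Im}(e^{in\theta}) = \mathrm{Im}[(e^{i\theta})^n] = \mathrm{Im}[(\cos \theta + i \sin \theta)^n]. \]
When n = 4 we have

\[ (\cos \theta + i \sin \theta)^4 = \cos^4 \theta + 4i \cos^3 \theta \sin \theta - 6 \cos^2 \theta \sin^2 \theta - 4i \cos \theta \sin^3 \theta + \sin^4 \theta. \]
Hence

\[ \cos 4\theta = \mathrm{Re}[(\cos \theta + i \sin \theta)^4] = \cos^4 \theta - 6 \cos^2 \theta \sin^2 \theta + \sin^4 \theta \]
and
\[ \sin 4\theta = \mathrm{Im}[(\cos \theta + i \sin \theta)^4] = 4(\cos^3 \theta \sin \theta - \cos \theta \sin^3 \theta). \]

(b) From Definition 6.6 we may write
\[ \cos^7 \theta = \left(\frac{e^{i\theta} + e^{-i\theta}}{2}\right)^7. \]
Expanding the right-hand side by the binomial theorem, simplifying and
grouping terms, we obtain

\[ \cos^7 \theta = \frac{1}{2^6}\left(\frac{e^{7i\theta} + e^{-7i\theta}}{2} + 7\,\frac{e^{5i\theta} + e^{-5i\theta}}{2} + 21\,\frac{e^{3i\theta} + e^{-3i\theta}}{2} + 35\,\frac{e^{i\theta} + e^{-i\theta}}{2}\right). \]

Again using Definition 6.6, we see that this immediately simplifies to
\[ \cos^7 \theta = \frac{1}{64}(\cos 7\theta + 7 \cos 5\theta + 21 \cos 3\theta + 35 \cos \theta). \]

(c) From Definition 6.6 we may write
\[ \sin^5 \theta = \left(\frac{e^{i\theta} - e^{-i\theta}}{2i}\right)^5. \]
Expanding the right-hand side, simplifying and grouping terms gives
\[ \sin^5 \theta = \frac{1}{2^4}\left(\frac{e^{5i\theta} - e^{-5i\theta}}{2i} - 5\,\frac{e^{3i\theta} - e^{-3i\theta}}{2i} + 10\,\frac{e^{i\theta} - e^{-i\theta}}{2i}\right). \]

Again appealing to Definition 6.6, we see that this immediately reduces to
\[ \sin^5 \theta = \frac{1}{16}(\sin 5\theta - 5 \sin 3\theta + 10 \sin \theta). \]

A variant of the method used here and in example (b) above is to be found
outlined in Problems 6.37 and 6.38.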
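
The expansions obtained in Example 6.10 are easily spot-checked at a few values of $\theta$. The following sketch evaluates both sides of each identity; the function names are illustrative and not part of the text.

```python
import math


def cos4_expansion(t: float) -> float:
    """cos^4 t - 6 cos^2 t sin^2 t + sin^4 t, claimed equal to cos 4t."""
    c, s = math.cos(t), math.sin(t)
    return c**4 - 6.0 * c**2 * s**2 + s**4


def sin4_expansion(t: float) -> float:
    """4 (cos^3 t sin t - cos t sin^3 t), claimed equal to sin 4t."""
    c, s = math.cos(t), math.sin(t)
    return 4.0 * (c**3 * s - c * s**3)


def cos7_as_sum(t: float) -> float:
    """(cos 7t + 7 cos 5t + 21 cos 3t + 35 cos t) / 64, claimed equal to cos^7 t."""
    return (math.cos(7 * t) + 7 * math.cos(5 * t)
            + 21 * math.cos(3 * t) + 35 * math.cos(t)) / 64.0


def sin5_as_sum(t: float) -> float:
    """(sin 5t - 5 sin 3t + 10 sin t) / 16, claimed equal to sin^5 t."""
    return (math.sin(5 * t) - 5 * math.sin(3 * t) + 10 * math.sin(t)) / 16.0


for t in (0.3, 1.1, 2.7):
    print(f"theta = {t}: "
          f"cos4 {cos4_expansion(t) - math.cos(4 * t):+.1e}, "
          f"sin4 {sin4_expansion(t) - math.sin(4 * t):+.1e}, "
          f"cos^7 {cos7_as_sum(t) - math.cos(t) ** 7:+.1e}, "
          f"sin^5 {sin5_as_sum(t) - math.sin(t) ** 5:+.1e}")
```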


PROBLEMS 


Section 6.1


6.1 Solve the differential equation dy/dx = y, with y(0) = c, as in Section 6.1, by
substituting

\[ y = \sum_{r=0}^{\infty} a_r x^r. \]

Hence deduce that, provided $c \ne 0$, the differential equation has the
non-trivial solution $y = ce^x$.


6.2 The function $y = e^{-x}$ satisfies the differential equation dy/dx = -y, with
y(0) = 1. Use the method of the previous problem to verify the series solution.


6:3 It follows from the argument preceding Eqn (6:16) in Section 6-1 that 
ἀπ: 
0< Sn -- Sr< (R-D! 
where the integer R > 2x. Use this result to deduce the least number of terms 
that must be included in the series expansion of e? in order that the error 
involved is less than 0-01. 
6.4 Evaluate the following limits:
(a) lim de?? + xe™ + 3. 
‘gow Sx + er + 1’ 
(x2 + 1e?? + e7 + 1. 


3 


La ae GPC Ba a Πεῖ: 
᾿ (2 -- x*)e* + 3 
I BOE ET 
ΗΝ 1+(2+ xje** 
3(2)ῆς-. ὃ: + x2 + 1) 


μὰ a 4e2z + 2x + | 


6:5 Make use of the series expansion of δῇ to evaluate the following limits and 
verify your result by using Theorem 5-14: 


ὃ: ... 
(a) lim - ": 
γ-«Ό 3x 
lim 
τοῦ sin 4x 
. e—1-x 
lim —————_. 
() eee 3x2 


6.6 Differentiate the following functions: 
(a) f(x) = 2e* cos x; 
(b) f(x) = e* arcsin x; 
(c) f(x) = εχ; 
(d) f{() = ersinz, 


6:7 Differentiate the following functions: 
(a) f(x) = arcsin εἷς; 
(0) f(x) = v'(xe? + x); 
(c) f(x) = sin (xe? + 2); 
(4) f(x) = (Ὁ — I/e* + 1). 


Section 6:2 
6:8 Differentiate the following functions: 
(a) f(x) = 3 exp [-(x? ++ x+ 1]; 
(b) fo) a esin® x. 
(c) f(x) = cos [exp (x sin x + 2)]. 


6-9 Find the second derivatives of the following functions: 
(a) f(x) = e®**; 
(b) f(x) = sin (1 + e*); 
(c) fx) -- 65:8 τ, 
6.10 Consider the function f(x) defined as follows:

\[ f(x) = \begin{cases} e^{-1/x^2} & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases} \]

Clearly the differentiability properties of this function at the origin must be
deduced directly from the definition of a derivative. To deduce these properties
show first that for $x \ne 0$, it follows that

\[ f'(x) = \frac{2}{x^3}\, e^{-1/x^2}. \]

Then, by using Definition 5.2 together with Theorem 6.3, prove that f'(0) = 0,
and hence deduce that

\[ \lim_{x\to 0} f'(x) = f'(0) = 0. \]

Finally, deduce that in general,
\[ f^{(n)}(x) = e^{-1/x^2} \times (\text{Polynomial in } 1/x), \]

and hence by using an inductive argument prove that $f^{(n)}(0) = 0$ for all n.
This is an example of a function which is capable of differentiation an arbitrary
number of times for all x, and yet which has every derivative equal to zero at
one point of its domain of definition.


6.11 Find $\partial f/\partial x$ and $\partial f/\partial y$, given that
\[ f(x, y) = e^{\sin (y/x)}. \]


6.12 Show that $u = xy + xe^{y/x}$ satisfies the equation
\[ x\,\frac{\partial u}{\partial x} + y\,\frac{\partial u}{\partial y} = xy + u. \]

6.13 Find df/dt, given that
f(x, y) = eer7y
where x = cos t, y = sin t.

6.14 Find $\partial f/\partial u$ and $\partial f/\partial v$ if
f(x, y) = 2 arctan


with x = u sin v and y = u cos v.


Section 6.3 


6:15 


Evaluate the following limits: 


ee 2 
(a) tim = PB 
wy. log 3 + 22). 
os alse OT πο. 
log (3 sin x) — log [([ + x) sin na.


(c) oo Fee | 
(d) Jim [log Gx + 1) — log @x + 5)]; 
(e) lim 2 = 
ae x 
616 Let f(x) and g(x) be functions such mae an 1 fC) = 0 ang Hi 1g) = 0 but 
lim £@) = A, Then 
rea σία x) 
m 28 ll + fool ca = lim log [ εὐ = lim 1 log {1 + μία), 


However, it sie. from Chapter 3, Sor ᾿ that lim [1 + f(D} = ς, 
ta 
so that 


lim log ΗΠ + foo] _ lim log ef (g(x) = lim 10) ΟΝ = A, 
ra & (x) ra x20 g(x g(x) 
Apply this result to evaluate the following limits: 


ἜΤ log (1 + 11); 
χτ-»0 2χ 


(by tigi log (1 + 3 sin x). 
xr-+0 x 
. log [1 — 2 sin? (x/2)]. 
(c) a ee 


use your result to deduce lim (cos xia? 
x—+0 


6:17 Apply Theorem 5-14 to evaluate the limits in Problem 6°16. 


6:18 Differentiate the following functions: 


(a) f(x) = log (x3 + 7x? + 2); 
(b) f(x) = log sin 2x; 


(c) f(x) = log cos ( = a} 


6.19 If $y = [f(x)]^{g(x)}$ then, taking the natural logarithm,
log y = g(x) log f(x).
Hence, differentiating with respect to x, it follows that

\[ \frac{dy}{dx} = \left[ g'(x) \log f(x) + g(x)\,\frac{f'(x)}{f(x)} \right] [f(x)]^{g(x)}. \]


Use this result to differentiate the following functions: 
(a) ν = x"; 

(b) ᾿Ξ = (sin 2x)*; 

(c) y= = xsin ας 

(d) y= es {Oleg sin x 
6.20 If u = x log (1 + x/y) + y log (1 + y/x)
eeu eu 
Pll ae pela’ 
show that x ae aye 
6:21 Find the total derivative dz given that 
z = log (x? + 2y?). 
6.22 Show that the function
\[ f(x, y) = \arctan (y/x) + \log (x^2 + y^2) \]
satisfies the equation

\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0. \]
6-23 By taking logarithms deduce éu/éx, éu/éy, and éu/éz if u = (xy). 


Section 6.4 
6:24 Use the definitions to establish the form taken by: 
(a) sinh x; 
(b) cosh x; 
(c) tanh x; 
when x is large. Distinguish between x large and positive and x large and 
negative. 


6:25 Prove by means of the definition that 
(cosh x + sinh x)” = cosh nx + sinh nx. 


6.26 Use the definitions to verify any three of the identities contained in Table 6.1.


6.27 Prove by means of the definitions that:
(a) 2 sinh x cosh y = sinh (x + y) + sinh (x - y);
(b) 2 cosh x cosh y = cosh (x + y) + cosh (x - y);
(c) 2 sinh x sinh y = cosh (x + y) - cosh (x - y).
6.28 Verify any three of the entries in Table 6.2.
6.29 Verify the derivatives of arccosech x/a and arcsech x/a given in Table 6.3.


6:30 Evaluate the following limits, using the series (6-27) and (6:28) where necessary: 


aim x3 cosh 2x + e% ; 
go (2x8 + x + 1)e2* + x8e-22’ 
(b) lim x3 cosh 2x + οἵ 
o> — oo (2x8 + x + 1)e2* 4+ x3e~-22’ 
ihn sinh as 
x—0 
1 — cosh 2x 
d) lim ---ο----------- 
(d) at 4x2 
6°31 Differentiate the following functions: 
(a) f(x) = sinh 2x cosh? x; 
(b) f(x) = exp (1 + cosh 3x); 
(c) f(x) = log (tanh x);
(ὦ f(x) = arcsech (x? + 3) if f(x) > 0; 
(e) f(x) = cosh (sin 2x). 

6:32 Evaluate éu/éx and Cu/ey given that: 
(a) u(x, y) = sin x cosh xy; 
(Ὁ) u(x, y) = sinh (x? + x sin y + 3y”); 
(c) u(x, y) = xcoshia? + 29°), 


Section 6.5
6.33 Establish by means of the definitions that:
(a) sin (iz) = i sinh z;
(b) cos (iz) = cosh z;
(c) sinh (iz) = i sin z;
(d) cosh (iz) = cos z.
6.34 Given that a, b are positive real numbers, deduce four trigonometric identities
by equating real and imaginary parts in each of the following results
$e^{ia} e^{ib} = e^{i(a+b)}$ and $e^{ia} e^{-ib} = e^{i(a-b)}$.


6.35 Express the following complex numbers in the form $re^{i\theta}$:


(a) 1 +i; (1 --᾿-ἰἰ; (Ο —8073 — 1); 
(ὦ (~1 +08 © (5 Ὁ 14/4 + ἢ. 


6.36 Show by means of de Moivre's theorem that:


(a) $32 \cos^6\theta = 10 + 15 \cos 2\theta + 6 \cos 4\theta + \cos 6\theta$;
(b) $\sin 7\theta = 7 \sin\theta - 56 \sin^3\theta + 112 \sin^5\theta - 64 \sin^7\theta$.


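6.37 Verify that if $z = e^{i\theta}$, then

$$\cos\theta = \frac{1}{2}\left(z + \frac{1}{z}\right) \quad\text{and}\quad \sin\theta = \frac{1}{2i}\left(z - \frac{1}{z}\right)$$

and, more generally,

$$\cos r\theta = \frac{1}{2}\left(z^r + \frac{1}{z^r}\right) \quad\text{and}\quad \sin r\theta = \frac{1}{2i}\left(z^r - \frac{1}{z^r}\right).$$

By replacing cos θ and sin θ by their equivalent expressions involving z,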
make use of these results to express cos? 4 sin? 6 in terms of sin 8. 


6°38 Use the method of Problem 6.37 to express sin® 6 in terms of cos πθ. 


6.39 Consider the function cosh z, where z = x + iy. Then, using Definition 6.3,
deduce that cosh z = 0 when $z = (2n + 1)\pi i/2$, with n = 0, ±1, ±2, ....
Use the results of Problem 6.33 to deduce the zeros of cos z.


6.40 Consider the function sin z, where z = x + iy. Then, using Definition 6.6,
deduce that sin z = 0 when $z = n\pi$, with n = 0, ±1, ±2, .... Use the
results of Problem 6.33 to deduce the zeros of sinh z.


Fundamentals of 
integration 


7.1 Definite integrals and areas


The work of this chapter is concerned with the theory of the operation known 
as integration, which occupies a central position in the calculus. The connec- 
tion between differentiation and integration is basic to the whole of the 
calculus and is contained in a result we shall prove later known as the funda- 
mental theorem of calculus. Once again, limiting operations will play an 
essential part in the development of our argument. In fact we will show not 
only how they enable a satisfactory general theory of integration to be 
established, but also how they provide a tool, albeit a clumsy one, for the 
actual integration of functions. However, aside from a number of simple but 
important examples, the practical details of the evaluation of integrals of 
specific classes of function will be deferred until Chapter 8. 

We begin by seeking to determine the shaded area I of Fig. 7.1 which is
interior to the region bounded above and below by the curve y = f(x) and
the x-axis, respectively, and to the left and right by the lines x = a, x = b.

This approach will lead naturally to what is called the definite integral of
f(x) over the interval a ≤ x ≤ b, and it illustrates a valuable geometrical
interpretation of the process of integration. Although we use the definite
integral to give precise meaning to the notion of the area contained within a
closed curve, this appeal to geometry is not actually necessary when defining
an integral. Indeed, we shall also show how a purely analytical definition of
a definite integral, quite independent of any geometrical arguments, may be
formulated.


Fig. 7.1 Area I defined by y = f(x).

Let f(x) be a non-negative continuous function defined in the closed
interval [a, b] and consider, for a moment, the conceptual problem that arises
when trying to determine the area I defined by it in Fig. 7.1. The only simple
plane geometrical figure for which the concept of area is defined in an elementary
and unambiguous manner is the rectangle, so that we shall seek to
define the area I in terms of the limit of a sum of rectangular areas. It should
perhaps be remarked at this point that the derivation of the formula $\pi r^2$ for
the area of a circle of radius r involves the concept of integration, although
this is invariably avoided in any first encounter by the employment of
arguments that are at best only plausible.

We shall start our discussion from the postulates that (a) the area of a 
rectangle is given by the product length × breadth, (b) the area of the union
of two non-overlapping rectangles is the sum of their separate areas, and 
(c) if a rectangle is divided into two parts by a curve, then the sum of the 
separate non-rectangular areas comprising these two parts is equal to the 
area of the rectangle. 

On the basis of postulate (c), we at once see that the area I in Fig. 7.1
exceeds the rectangular area ABEF, but is less than the rectangular area
ACDF. Letting m, M denote, respectively, the minimum and maximum
values attained by f(x) in [a, b], this result becomes


$$m(b - a) \le I \le M(b - a). \tag{7.1}$$


This inequality, although interesting, must obviously be refined if it is
ever to lead to the actual value of I. In principle, our approach will be simple,
for we shall begin by dividing [a, b] into n adjacent sub-intervals in each of
which an inequality of type (7.1) will apply, after which we shall use postulate
(b) to find better upper and lower bounds for I.

Specifically, we start by choosing any sequence of n + 1 numbers
$x_0, x_1, \ldots, x_n$ subject only to the requirements that $x_0 = a$, $x_n = b$, and


$$x_0 < x_1 < \cdots < x_{n-1} < x_n.$$


The sequence $\{x_i\}_{i=0}^{n}$ so defined is called a partition P of the interval [a, b],
and for any given value of n it is obviously not unique. Next, on each sub-interval
$[x_{i-1}, x_i]$, let the function f(x) attain a minimum value $m_i$ and a
maximum value $M_i$, and denote the length of the ith sub-interval by $\Delta_i$, so
that


$$\Delta_i = x_i - x_{i-1}.$$


We now define numbers $\underline{S}_P$ and $\overline{S}_P$ called, respectively, the lower and upper
sums taken over the partition P, by the expressions


$$\underline{S}_P = m_1\Delta_1 + m_2\Delta_2 + \cdots + m_n\Delta_n = \sum_{r=1}^{n} m_r \Delta_r \tag{7.2}$$

and

$$\overline{S}_P = M_1\Delta_1 + M_2\Delta_2 + \cdots + M_n\Delta_n = \sum_{r=1}^{n} M_r \Delta_r. \tag{7.3}$$


Clearly, as Figs. 7.2 (a), (b) illustrate, $\underline{S}_P$ and $\overline{S}_P$ are, respectively, under- and
over-estimates of the area I.


The fact that $\underline{S}_P \le \overline{S}_P$ is apparent on geometrical grounds, but it also
follows without appeal to geometry by considering the difference


$$\overline{S}_P - \underline{S}_P = (M_1 - m_1)\Delta_1 + (M_2 - m_2)\Delta_2 + \cdots + (M_n - m_n)\Delta_n. \tag{7.4}$$


Fig. 7.2 (a) Shaded area represents lower sum $\underline{S}_P$; (b) shaded area represents
upper sum $\overline{S}_P$.


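In this equation we have, by definition, $\Delta_r > 0$ and $M_r \ge m_r$ for r = 1,
2, ..., n, so that


$$\overline{S}_P - \underline{S}_P \ge 0 \quad\text{or}\quad \underline{S}_P \le \overline{S}_P,$$

and thus by postulate (c),

$$\underline{S}_P \le I \le \overline{S}_P. \tag{7.5}$$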
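As a minimal numerical sketch of the lower and upper sums (7.2) and (7.3), the Python fragment below evaluates both for f(x) = x² on [1, 2]. The helper names `uniform_partition` and `lower_upper_sums` are illustrative only and do not come from the text. The code assumes that f is monotonic increasing on [a, b], so that the minimum and maximum on each sub-interval are attained at its left and right end-points, respectively.

```python
def uniform_partition(a, b, n):
    """Return the n + 1 points dividing [a, b] into n equal sub-intervals."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def lower_upper_sums(f, partition):
    """Return (lower sum, upper sum) of f over the given partition.

    Assumption: f is monotonic increasing, so on each sub-interval the
    minimum m_r is f(left end) and the maximum M_r is f(right end).
    """
    lower = 0.0
    upper = 0.0
    for x_prev, x_next in zip(partition, partition[1:]):
        delta = x_next - x_prev       # length of the sub-interval
        lower += f(x_prev) * delta    # contribution m_r * Delta_r
        upper += f(x_next) * delta    # contribution M_r * Delta_r
    return lower, upper


if __name__ == "__main__":
    square = lambda x: x * x
    for n in (4, 16, 64):
        lo, up = lower_upper_sums(square, uniform_partition(1.0, 2.0, n))
        print(n, lo, up)              # the two sums bracket the area
```

For every n the two printed numbers bracket the area under the curve, in agreement with inequality (7.5), and the bracket narrows as n grows.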


It would seem reasonable to suppose that as the number n of points in a
partition increases, provided the lengths of all intervals shrink to zero, the
limit of both the lower and upper sums must be I, the desired area. We prove
this in two stages, first considering the effect on the lower and upper sums of
the refinement of the partition P by the inclusion of extra points.

It will suffice here to consider only the effect of the inclusion of one extra
point $x_r'$ between $x_{r-1}$ and $x_r$ in the partition P. The resulting partition $P'$ is
called a refinement of P, in the sense that although $P'$ has more points than P,
all points of P are also points of $P'$.

Suppose that in the intervals $[x_{r-1}, x_r']$ and $[x_r', x_r]$ the function f(x)
attains the minimum values $m_r'$ and $m_r''$, respectively. Then the effect of the
extra point is to replace the term $m_r(x_r - x_{r-1})$ in the lower sum $\underline{S}_P$ by the
sum $m_r'(x_r' - x_{r-1}) + m_r''(x_r - x_r')$, thereby generating the sum $\underline{S}_{P'}$ appropriate
to the refinement $P'$ of the partition P. As it must be true that $m_r \le m_r'$
and $m_r \le m_r''$, it thus follows that


$$m_r'(x_r' - x_{r-1}) + m_r''(x_r - x_r') \ge m_r(x_r - x_{r-1}),$$

whence

$$\underline{S}_P \le \underline{S}_{P'}. \tag{7.6}$$


Identical reasoning involving the maxima $M_r'$ and $M_r''$ attained by f(x) in
the intervals $[x_{r-1}, x_r']$ and $[x_r', x_r]$ establishes that


$$\overline{S}_{P'} \le \overline{S}_P. \tag{7.7}$$


Fig. 7.3 Effect of refinement of a partition: (a) area inequality on interval
$[x_{r-1}, x_r]$ of P; (b) area inequality on interval $[x_{r-1}, x_r]$ of $P'$.


The inequalities leading to results (7.6) and (7.7) are illustrated geometrically
in Fig. 7.3. Thus in Fig. 7.3 (a) the area inequalities associated with the
interval $[x_{r-1}, x_r]$ of P are displayed, whilst in Fig. 7.3 (b) the corresponding
situation is displayed for the refinement $P'$ produced by inserting an additional
point $x_r'$ in $[x_{r-1}, x_r]$.

The further refinement of the partition $P'$ by the inclusion of additional
points only serves to reinforce results (7.6) and (7.7). We have thus established
that if the partitions $P_1, P_2, \ldots, P_m$ are successive refinements of the
partition P, then


$$m(b - a) \le \underline{S}_P \le \underline{S}_{P_1} \le \cdots \le \underline{S}_{P_m} \le I \le \overline{S}_{P_m} \le \cdots \le \overline{S}_{P_1} \le \overline{S}_P \le M(b - a). \tag{7.8}$$


Expressed in words, the effect of refinement of a partition is to increase the
corresponding lower sum and to decrease the corresponding upper sum, so
that $\{\underline{S}_{P_m}\}$ is a monotonic increasing sequence of numbers, and $\{\overline{S}_{P_m}\}$ is a
monotonic decreasing sequence of numbers.
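The effect of a refinement described by (7.6) and (7.7) can be checked numerically. The following is a small sketch under stated assumptions: the function is f(x) = x² on [0, 1], which is monotonic increasing there, and the helper names `refine` and `lower_upper_sums` are hypothetical, introduced only for this illustration.

```python
def lower_upper_sums(f, partition):
    """Lower and upper sums, assuming f is monotonic increasing."""
    lower = 0.0
    upper = 0.0
    for x_prev, x_next in zip(partition, partition[1:]):
        delta = x_next - x_prev
        lower += f(x_prev) * delta    # minimum is at the left end-point
        upper += f(x_next) * delta    # maximum is at the right end-point
    return lower, upper


def refine(partition, extra_point):
    """Return the refinement P' obtained by adding one extra point to P."""
    return sorted(set(partition) | {extra_point})


if __name__ == "__main__":
    square = lambda x: x * x
    partition = [0.0, 1.0]
    print(partition, lower_upper_sums(square, partition))
    for point in (0.5, 0.25, 0.75):
        partition = refine(partition, point)
        # Each extra point can only raise the lower sum and reduce the upper sum.
        print(partition, lower_upper_sums(square, partition))
```

Adding points one at a time in this way produces a sequence of lower sums that never decreases and a sequence of upper sums that never increases.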

For the second and final stage of our argument we introduce the norm
$\|\Delta\|_P$ of a partition P by means of the definition

$$\|\Delta\|_P = \max_i\,(x_i - x_{i-1}). \tag{7.9}$$


That is to say, for any partition P of the interval [a, b], the norm $\|\Delta\|_P$ is
the length of the longest sub-interval of [a, b] produced by the partition.

Let us consider a sequence of partitions $P_1, P_2, \ldots, P_m, \ldots$ which are successive
refinements of P and are such that

$$\lim_{m \to \infty} \|\Delta\|_{P_m} = 0.$$

Then by the postulate of Section 3.2, as $\{\underline{S}_{P_m}\}$ is monotonic increasing and
bounded above it must tend to a limit $\underline{S}$ and, similarly, as $\{\overline{S}_{P_m}\}$ is monotonic
decreasing and bounded below it must tend to a limit $\overline{S}$, where


$$\underline{S} \le I \le \overline{S}. \tag{7.10}$$
To show that $\underline{S} = \overline{S}$, as would be expected, observe that if


$$\delta_P = \max_i\,(M_i - m_i),$$


then Eqn (7.4) gives rise to the inequality

$$\overline{S}_P - \underline{S}_P \le \delta_P(\Delta_1 + \Delta_2 + \cdots + \Delta_n) = \delta_P(b - a). \tag{7.11}$$


Hence, for any sequence of partitions $P_1, P_2, \ldots, P_m, \ldots$ which are refinements
of P with the property that $\lim \|\Delta\|_{P_m} = 0$, it follows from the
continuity of f(x) that $\lim \delta_{P_m} = 0$, thereby showing that $\{\overline{S}_{P_m} - \underline{S}_{P_m}\}$ is a
null sequence. Thus $\{\underline{S}_{P_m}\}$ and $\{\overline{S}_{P_m}\}$ both have the same limit.
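Inequality (7.11) lends itself to a direct numerical check. The sketch below is illustrative only: it takes f(x) = x² on [1, 2], assumes f is monotonic increasing so that the maximum and minimum on each sub-interval sit at its end-points, and uses the hypothetical helper names `norm` and `gap_and_bound`.

```python
def norm(partition):
    """Norm of a partition: the length of its longest sub-interval."""
    return max(right - left for left, right in zip(partition, partition[1:]))


def gap_and_bound(f, partition):
    """Return (upper sum - lower sum, delta_P * (b - a)).

    Assumption: f is monotonic increasing, so M_i - m_i = f(x_i) - f(x_{i-1}).
    """
    pairs = list(zip(partition, partition[1:]))
    # Difference of the upper and lower sums, term by term.
    gap = sum((f(right) - f(left)) * (right - left) for left, right in pairs)
    # delta_P is the largest oscillation M_i - m_i over all sub-intervals.
    delta_p = max(f(right) - f(left) for left, right in pairs)
    return gap, delta_p * (partition[-1] - partition[0])


if __name__ == "__main__":
    square = lambda x: x * x
    for n in (2, 4, 8, 16, 32):
        partition = [1.0 + i / n for i in range(n + 1)]
        gap, bound = gap_and_bound(square, partition)
        print(norm(partition), gap, bound)   # gap <= bound, both shrink with the norm
```

As the norm of the partition is halved, both the gap between the sums and its bound shrink towards zero, which is the content of the null-sequence argument.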

Taken in conjunction with Eqn (7.10), we have proved that because of the
continuity of f(x), the limit of the lower sums is equal to the limit of the upper
sums, and each is equal to the limit I which has been interpreted as the
shaded area in Fig. 7.1.

The limiting argument used above certainly suffices to define the area I,
but before formulating our definition of the definite integral, let us first make
a useful generalization of our argument. With the partition P used earlier
associate any set of n numbers $\xi_1, \xi_2, \ldots, \xi_n$ for which it is true that


$$x_0 \le \xi_1 \le x_1, \quad x_1 \le \xi_2 \le x_2, \quad \ldots, \quad x_{n-1} \le \xi_n \le x_n.$$


Now form the approximating sum $S_P$ defined by


$$S_P = f(\xi_1)\Delta_1 + f(\xi_2)\Delta_2 + \cdots + f(\xi_n)\Delta_n. \tag{7.12}$$

Then because $m_i \le f(\xi_i) \le M_i$, it follows at once that

$$\underline{S}_{P_m} \le S_{P_m} \le \overline{S}_{P_m} \tag{7.13}$$

for all refinements $P_1, P_2, \ldots, P_m, \ldots$ of the partition P. Consequently,
since $\lim \underline{S}_{P_m} = \lim \overline{S}_{P_m} = I$, it follows immediately from Theorem 3.6 that
$\lim S_{P_m} = I$.


This important result asserts that if f(x) is continuous on [a, b], then as
the partition is refined, so the corresponding upper and lower sums $\overline{S}_{P_m}$,
$\underline{S}_{P_m}$ and the approximating sum $S_{P_m}$ all converge to the same limit. We now
state this as our first fundamental theorem which forms the basis of our
development of the integral.


THEOREM 7.1 (first limit theorem for sums defined on a partition) Let
f(x) be a continuous non-negative function on the closed interval a ≤ x ≤ b,
and let $P_1, P_2, \ldots, P_m, \ldots$ be a sequence of successive refinements of some
partition P of [a, b] with the property that $\lim \|\Delta\|_{P_m} = 0$. Then, if $\xi_i$ is any
point in the ith sub-interval of length $\Delta_i$ generated by the partition $P_m$, and
$\underline{S}_{P_m}$ and $\overline{S}_{P_m}$ are respectively the lower and upper sums associated with $P_m$,
it follows that

$$\lim_{m \to \infty} \underline{S}_{P_m} = \lim_{m \to \infty} \overline{S}_{P_m} = \lim_{\|\Delta\|_{P_m} \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta_i.$$

This theorem suggests the following form of definition for the definite 
integral. 


DEFINITION 7.1 (definite integral of a continuous non-negative function)
Let f(x) be a continuous non-negative function on the closed interval
a ≤ x ≤ b, and let $P_1, P_2, \ldots, P_m, \ldots$ be a sequence of successive refinements
of some partition P of [a, b] with the property that $\lim \|\Delta\|_{P_m} = 0$.
Then, if $\xi_i$ is any point in the ith sub-interval of length $\Delta_i$ generated by the
partition $P_m$, the definite integral of f(x) integrated over the interval [a, b],
and written symbolically


$$\int_a^b f(x)\,dx,$$


is defined to be


$$\int_a^b f(x)\,dx = \lim_{\|\Delta\|_{P_m} \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta_i.$$


In the context of a definite integral, the function f(x) is called the integrand,
the numbers a, b are called the lower and upper limits of integration,
respectively, and the sign ∫ itself is called the integral sign.

In summary then, a definite integral of a positive continuous function
f(x) integrated over the interval [a, b] is a positive number defined by means
of a limiting process. It may be interpreted geometrically as the shaded area
I below the curve y = f(x) as shown in Fig. 7.1.
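To see that the choice of the points ξ_i in Definition 7.1 does not affect the limit, the approximating sum can be formed with the ξ_i picked at random in each sub-interval. The sketch below does this for f(x) = x² on [1, 2]; the helper names `random_tags` and `riemann_sum`, the fixed random seed, and the choice of equal sub-intervals are all assumptions made for the purpose of illustration.

```python
import random


def random_tags(partition, rng):
    """Pick one point xi_i at random in each sub-interval of the partition."""
    return [rng.uniform(left, right)
            for left, right in zip(partition, partition[1:])]


def riemann_sum(f, partition, tags):
    """Approximating sum f(xi_1)*Delta_1 + ... + f(xi_n)*Delta_n."""
    intervals = zip(partition, partition[1:])
    return sum(f(tag) * (right - left)
               for (left, right), tag in zip(intervals, tags))


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so that runs are repeatable
    square = lambda x: x * x
    for n in (10, 100, 1000):
        partition = [1.0 + i / n for i in range(n + 1)]
        tags = random_tags(partition, rng)
        print(n, riemann_sum(square, partition, tags))
```

Whatever points are drawn, the approximating sum is trapped between the lower and upper sums for the same partition, so the printed values settle down as n increases.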

To show that this is a working definition, in the sense that it can be used 
to yield a useful answer, let us now apply it to a simple function. 


Example 7.1 Evaluate the definite integral


$$\int_a^b x^2\,dx, \quad \text{where } a < b.$$


Solution As $x^2$ is everywhere continuous and is non-negative on the stated
interval, Definition 7.1 applies. Thus we start by considering a convenient
partition $P_n$ in which [a, b] is divided into n equal sub-intervals, each of
length $\Delta = (b - a)/n$. Then, if for convenience we identify $\xi_i$ with the right-hand
end-point of the ith sub-interval, we have


$$\xi_1 = a + \Delta, \quad \xi_2 = a + 2\Delta, \quad \xi_3 = a + 3\Delta, \quad \ldots, \quad \xi_n = a + n\Delta.$$


Hence, from Definition 7.1,

$$I = \lim_{n \to \infty} \sum_{i=1}^{n} (a + i\Delta)^2 \Delta.$$

Expanding and grouping the terms of the summation then gives

$$I = \lim_{n \to \infty} \left[ na^2\Delta + 2a\Delta^2(1 + 2 + 3 + \cdots + n) + \Delta^3(1^2 + 2^2 + 3^2 + \cdots + n^2) \right].$$

Using the fact that $\Delta = (b - a)/n$ together with the well-known results

$$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$$

and

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6},$$

it follows that

$$I = \lim_{n \to \infty} \left\{ a^2(b - a) + a(b - a)^2\left(\frac{n + 1}{n}\right) + \frac{(b - a)^3}{6}\left[\frac{(n + 1)(2n + 1)}{n^2}\right] \right\}.$$

Thus, taking the limit, we find

$$I = \tfrac{1}{3}(b^3 - a^3),$$

and so

$$\int_a^b x^2\,dx = \tfrac{1}{3}(b^3 - a^3).$$

In terms of numbers, if a = 1, b = 2, then

$$\int_1^2 x^2\,dx = \tfrac{1}{3}(2^3 - 1^3) = \tfrac{7}{3}.$$
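The finite sum in this example can be evaluated directly and compared with the grouped expression obtained from the two summation formulae. The following sketch is one way to do so; the function names `right_endpoint_sum` and `closed_form_sum` are illustrative and are not taken from the text.

```python
def right_endpoint_sum(a, b, n):
    """Sum of (a + i*Delta)**2 * Delta for i = 1, ..., n, with Delta = (b - a)/n."""
    delta = (b - a) / n
    return sum((a + i * delta) ** 2 * delta for i in range(1, n + 1))


def closed_form_sum(a, b, n):
    """The same sum after applying the formulae for 1 + ... + n and 1^2 + ... + n^2."""
    h = b - a
    return (a * a * h
            + a * h * h * (n + 1) / n
            + h ** 3 * (n + 1) * (2 * n + 1) / (6 * n * n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Both columns agree, and both approach (b**3 - a**3)/3 as n grows.
        print(n, right_endpoint_sum(1.0, 2.0, n), closed_form_sum(1.0, 2.0, n))
    print("limit:", (2.0 ** 3 - 1.0 ** 3) / 3)
```

With a = 1 and b = 2 the two expressions agree for each n and approach 7/3 from above, as expected for right-hand end-points of an increasing function.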


When the behaviour of f(x) is monotonic over the interval a ≤ x ≤ b,
then Theorem 7.1 coupled with Definition 7.1 can often be used to derive
interesting and useful series approximations to the definite integral as the
following example illustrates.

Example 7.2 Show that


$$\sum_{r=1}^{n} \left(\frac{1}{n + r - 1}\right) \ge \int_1^2 \frac{dx}{x} \ge \sum_{r=1}^{n} \left(\frac{1}{n + r}\right).$$


Solution In this case f(x) = 1/x, which is continuous, positive, and monotonic
decreasing on the interval [1, 2] so that Theorem 7.1 and Definition 7.1
apply. We again choose a partition $P_n$ which divides the interval [1, 2] into
n equal sub-intervals of length $\Delta = 1/n$. The general point $x_r$ in the partition
$P_n$ is, of course, $x_r = 1 + r/n$ so that


$$f(x_r) = \frac{n}{n + r}.$$

Thus as f(x) is monotonic decreasing, it follows that on the interval $[x_{r-1}, x_r]$,
f(x) attains its maximum value $M_r$ at $x_{r-1}$ and its minimum value $m_r$ at $x_r$,
where

$$M_r = \frac{n}{n + r - 1} \quad\text{and}\quad m_r = \frac{n}{n + r}.$$

Hence

$$\overline{S}_{P_n} = \sum_{r=1}^{n} \left(\frac{1}{n + r - 1}\right) \quad\text{and}\quad \underline{S}_{P_n} = \sum_{r=1}^{n} \left(\frac{1}{n + r}\right),$$

so that from Theorem 7.1 and Definition 7.1, we deduce that

$$\sum_{r=1}^{n} \left(\frac{1}{n + r - 1}\right) \ge \int_1^2 \frac{dx}{x} \ge \sum_{r=1}^{n} \left(\frac{1}{n + r}\right).$$

A few numbers might help here, so we show in the table below the behaviour
of the upper and lower sums $\overline{S}_{P_n}$ and $\underline{S}_{P_n}$ as a function of n.


n      $\overline{S}_{P_n}$     $\underline{S}_{P_n}$
5      0.7456     0.6456
10     0.7188     0.6688
15     0.7101     0.6768
∞      0.6931     0.6931

We shall discover later that the exact result, which is shown in this table
against the entry n = ∞, is in fact $\log_e 2$.
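The entries of this table are easily regenerated. The short sketch below sums the two series of Example 7.2 directly; the function names `upper_sum` and `lower_sum` are illustrative only.

```python
import math


def upper_sum(n):
    """Upper sum for 1/x on [1, 2]: 1/(n + r - 1) summed over r = 1, ..., n."""
    return sum(1.0 / (n + r - 1) for r in range(1, n + 1))


def lower_sum(n):
    """Lower sum for 1/x on [1, 2]: 1/(n + r) summed over r = 1, ..., n."""
    return sum(1.0 / (n + r) for r in range(1, n + 1))


if __name__ == "__main__":
    for n in (5, 10, 15, 1000):
        print(n, round(upper_sum(n), 4), round(lower_sum(n), 4))
    print("log 2 =", round(math.log(2.0), 4))
```

The rows for n = 5, 10 and 15 reproduce the tabulated values to four decimal places, and for large n both sums close in on log 2.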
Before closing this section let us give brief consideration to the effect on
Theorem 7.1 of removing the condition of continuity imposed on the function
f(x) and substituting instead the condition that f(x) is bounded. The argument
leading to Theorem 7.1 proceeds as before until the stage at which $\underline{S}$
and $\overline{S}$ are defined. Then, without the continuity of f(x) to ensure that
$|M_r - m_r| \to 0$ as $|x_r - x_{r-1}| \to 0$, it is no longer possible to infer that
when $\lim \underline{S}_{P_m}$ and $\lim \overline{S}_{P_m}$ exist, they are necessarily equal. However, if they
do exist and are equal, it follows as before that $S_{P_m}$ also converges to the
same limit. Thus we arrive at the following more general form of Theorem
7.1.
THEOREM 7.2 (second limit theorem for sums defined on a partition) Let
f(x) be a non-negative bounded function defined on the closed interval [a, b],
and let $P_1, P_2, \ldots, P_m, \ldots$ be a sequence of successive refinements of some
partition P of a ≤ x ≤ b with the property that $\lim \|\Delta\|_{P_m} = 0$. Then, if
$\xi_i$ is any point in the ith sub-interval of length $\Delta_i$ generated by the partition
$P_m$, and $\underline{S}_{P_m}$ and $\overline{S}_{P_m}$ are respectively the lower and upper sums associated
with $P_m$, it follows that if

$$\lim_{m \to \infty} \underline{S}_{P_m} = \lim_{m \to \infty} \overline{S}_{P_m} = I,$$

it must also be true that

$$I = \lim_{\|\Delta\|_{P_m} \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta_i.$$


The corresponding modification of Definition 7.1 is given below for 
reference and, because this form of definition was first given by B. Riemann 
(1826-66), the definite integral is known formally as the Riemann integral. 
Usually only the term definite integral will be employed. 


DEFINITION 7.2 (Riemann integral of a non-negative function) Let f(x)
be a non-negative bounded function on the closed interval a ≤ x ≤ b, and
let $P_1, P_2, \ldots, P_m, \ldots$ be a sequence of successive refinements of some
partition P of [a, b] with the property that $\lim \|\Delta\|_{P_m} = 0$. Furthermore,
let $\xi_i$ be any point in the ith sub-interval of length $\Delta_i$ generated by the
partition $P_m$, and let $\underline{S}_{P_m}$ and $\overline{S}_{P_m}$ be, respectively, the lower and upper sums
associated with $P_m$.

Then, if

$$\lim_{m \to \infty} \underline{S}_{P_m} = \lim_{m \to \infty} \overline{S}_{P_m},$$

the Riemann integral of f(x) integrated over the interval [a, b], and written
symbolically

$$\int_a^b f(x)\,dx,$$

is defined to be

$$\int_a^b f(x)\,dx = \lim_{\|\Delta\|_{P_m} \to 0} \sum_{i=1}^{n} f(\xi_i)\Delta_i.$$

To show that not all bounded functions are Riemann integrable it is only
necessary to consider the integral over the interval 0 ≤ x ≤ 1 of the function

$$f(x) = \begin{cases} 1 & \text{for } x \text{ rational} \\ 0 & \text{for } x \text{ irrational.} \end{cases}$$

Then clearly f(x) is non-negative and bounded on [0, 1], but by a suitable
choice of the numbers $\xi_i$ in the approximating sum of Definition 7.2, the
limit of the sum may be made to assume any value between zero and unity.
This situation arises because the limits of the upper and lower sums are not
the same. In more advanced accounts these difficulties are overcome by
defining a more general form of integral known as the Lebesgue integral.

7.2 Integration of arbitrary continuous functions


As most functions assume both positive and negative values in their domain 
of definition, our notion of a definite integral as formulated so far is rather 
restrictive, for it requires that the integrand be non-negative. A brief examina- 
tion of the introductory arguments used in the previous section shows that 
this restriction stems from our idea of area as being an essentially positive 
quantity, although this was not stated explicitly at any stage in our argument. 

Nothing in the limiting arguments that we used requires either the upper
and lower sums themselves, or any of the terms comprising them to be non-negative.
Since a term in either of these sums will be negative when $m_r$ or
$M_r$ is negative, that is, when f(x) is negative, it follows that the interpretation
of a definite integral as an area may be extended to continuous
functions f(x) which assume negative values provided that areas below the
x-axis are regarded as negative. This is illustrated in Fig. 7.4 in which the
positive and negative area contributions to the definite integral of f(x)
integrated over the interval [a, b] are marked accordingly.

Thus using this convention when interpreting a definite integral as an 
area, we may remove the condition that the integrand f(x) be non-negative 
throughout all of Section 7-1. Because it simply amounts to the deletion of the 
word ‘non-negative’, we shall not trouble to reformulate our earlier definitions 
and theorems to take account of this result. It is interesting to observe that 
had we introduced the definite integral via the upper and lower sums, without 
any appeal to graphs and areas, this artificial restriction would never have 
arisen. 

The definition of a definite integral of a function f(x) integrated over the 
interval [a, b] immediately implies a number of important general results 
which we now state in the form of a theorem. No proofs will be offered since 
the results are virtually self-evident. 


THEOREM 7.3 (properties of definite integrals) Let f(x), g(x) be continuous
functions defined on the closed interval a ≤ x ≤ b, and let c be a constant
and k be such that a < k < b. Then

(a) $\int_a^b f(x)\,dx = \int_a^k f(x)\,dx + \int_k^b f(x)\,dx$ (Additivity with respect to
interval of integration),

(b) $\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx$ (Homogeneity),

(c) $\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$ (Linearity).
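Although the three properties are stated without proof, they can at least be checked numerically. The sketch below approximates each definite integral by a sum over equal sub-intervals in which ξ_i is taken at the mid-point. The helper `midpoint_sum`, the test functions sin x and exp x, and the values a = 0, b = 2, k = 0.7, c = 3 are arbitrary choices made for illustration.

```python
import math


def midpoint_sum(f, a, b, n=2000):
    """Approximating sum over n equal sub-intervals, with xi_i at each mid-point."""
    delta = (b - a) / n
    return sum(f(a + (i + 0.5) * delta) for i in range(n)) * delta


if __name__ == "__main__":
    f, g = math.sin, math.exp
    a, b, k, c = 0.0, 2.0, 0.7, 3.0

    # (a) additivity with respect to the interval of integration
    print(midpoint_sum(f, a, b), midpoint_sum(f, a, k) + midpoint_sum(f, k, b))
    # (b) homogeneity
    print(midpoint_sum(lambda x: c * f(x), a, b), c * midpoint_sum(f, a, b))
    # (c) linearity
    print(midpoint_sum(lambda x: f(x) + g(x), a, b),
          midpoint_sum(f, a, b) + midpoint_sum(g, a, b))
```

Properties (b) and (c) hold for the finite sums themselves, which is why they survive the passage to the limit; property (a) holds for the sums only approximately here, because k need not be a point of the partition.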


Fig. 7.4 Positive and negative areas defined by y = f(x).


By virtue of these results, the definite integral of the function f(x) appropriate
to Fig. 7.4 could, if desired, be written in terms of the sum of three
integrals involving non-negative integrands. To achieve this, notice that f(x)
is negative for $k_1 < x < k_2$, so that for all x in this interval, −f(x) is positive.
Then, first expressing our integral as the sum of three separate integrals over
adjacent intervals


$$\int_a^b f(x)\,dx = \int_a^{k_1} f(x)\,dx + \int_{k_1}^{k_2} f(x)\,dx + \int_{k_2}^b f(x)\,dx, \tag{7.14}$$

we can replace −f(x) by |f(x)| in the second of these integrals to obtain

$$\int_a^b f(x)\,dx = \int_a^{k_1} f(x)\,dx - \int_{k_1}^{k_2} |f(x)|\,dx + \int_{k_2}^b f(x)\,dx. \tag{7.15}$$


Each of these integrals is now the definite integral of a non-negative
function as required.

We must now take account of the fact that so far it has been implicit in
our definition of a definite integral that x increases positively from a to b,
where b > a. This sense, or direction, of integration is indicated in the definite
integral by writing a at the bottom of the integral sign ∫ to signify the lower
limit of integration and by writing b at the top to signify the upper limit of
integration. If, despite the fact that b > a, their positions as upper and lower
limits of integration are reversed, this implies that integration is to be carried
out in the direction in which x increases negatively. Because we are now
allowing areas to have both magnitude and sign, to be consistent we must
compensate for a reversal of the limits of integration by changing the sign of
the integral. Hence we arrive at our next definition.


DEFINITION 7.3 (reversal of limits of integration) If a < b, then we
define the definite integral

$$\int_b^a f(x)\,dx$$

of a continuous function f(x) by the equation

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx.$$


Example 7.3 Evaluate the definite integral

$$\int_3^1 2x^2\,dx.$$

Solution From Definition 7.3 we have

$$\int_3^1 2x^2\,dx = -\int_1^3 2x^2\,dx.$$

Hence an application of Theorem 7.3 (b) together with the result of Example
7.1 shows that

$$\int_3^1 2x^2\,dx = -2\int_1^3 x^2\,dx = -2\left(\tfrac{1}{3}\right)(3^3 - 1^3) = -\tfrac{52}{3}.$$
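The sign convention for reversed limits arises automatically in an approximating sum if the step from the lower to the upper limit is allowed to be negative. The sketch below, using the hypothetical helper `signed_midpoint_sum`, illustrates this for the integrand 2x² of this example.

```python
def signed_midpoint_sum(f, lower, upper, n=1000):
    """Mid-point approximating sum taken from `lower` to `upper`.

    The step delta is negative when upper < lower, so reversing the
    limits reverses the sign of every term in the sum.
    """
    delta = (upper - lower) / n
    return sum(f(lower + (i + 0.5) * delta) for i in range(n)) * delta


if __name__ == "__main__":
    integrand = lambda x: 2.0 * x * x
    forward = signed_midpoint_sum(integrand, 1.0, 3.0)
    backward = signed_midpoint_sum(integrand, 3.0, 1.0)
    print(forward, backward)     # equal in magnitude, opposite in sign
    print(-52.0 / 3.0)           # value obtained in the example
```

The two sums use the same mid-points, so they differ only in sign, and the reversed sum lies close to −52/3.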


Since a definite integral is simply a number, the choice of symbol used to
denote the argument of the function f forming the integrand is arbitrary, and
often it is convenient to replace x by some other variable, say t. Thus

$$\int_a^b f(x)\,dx \quad\text{and}\quad \int_a^b f(t)\,dt$$

are identical in meaning, so that

$$\int_a^b f(x)\,dx = \int_a^b f(t)\,dt. \tag{7.16}$$

On account of this fact, the variable in the integrand of a definite integral
is often called a dummy variable, and it is sometimes said to be ‘integrated
out’ when the integral is evaluated. This fact is usually recognized in modern
accounts of the theory of the definite integral by simply writing

$$\int_a^b f$$

in place of either of the expressions in Eqn (7.16). The full significance of the
symbol dx, which is suggestive of a differential, comes when changes of
variable of the form x = g(u) are made in Eqn (7.16) and it is for this reason
that we choose to retain it. This matter will be taken up in detail in the next
chapter, where it is shown that because of the chain rule for differentiation,
dx can indeed be interpreted as a differential.


Fig. 7.5 (a) Area I bounded by curves y = f(x) and y = g(x); (b) area below
y = f(x); (c) positive and negative areas defined by y = g(x).

Now that the definite integral has been extended to arbitrary continuous
integrands we are in a position to determine quite general areas. Consider,
for example, the situation illustrated in Fig. 7.5 (a) in which it is desired to
determine the area I of the shaded region. Then obviously, referring to
Figs. 7.5 (b), (c) we have

$$I = I_1 + I_2 - I_3 + I_4,$$

where $I_1$ to $I_4$ represent the positive areas identified by these symbols.
However, we know that

$$I_1 = \int_a^b f(x)\,dx,$$

and from the form of argument leading to Eqn (7.15) we also know that

$$-I_2 = \int_a^{k_1} g(x)\,dx, \quad I_3 = \int_{k_1}^{k_2} g(x)\,dx, \quad -I_4 = \int_{k_2}^b g(x)\,dx,$$

where $k_1$ and $k_2$ are the first and second points of intersection of y = g(x)
with the x-axis as x increases from a to b.
However, by Theorem 7.3 (a) we have

$$-I_2 + I_3 - I_4 = \int_a^b g(x)\,dx,$$

so that combining these results we obtain

$$I = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx.$$

From Theorem 7.3 (b) and (c) it then finally follows that

$$I = \int_a^b (f(x) - g(x))\,dx. \tag{7.17}$$


Fig. 7.6 Piecewise continuous function y = f(x) defining a sequence of areas
$I_1, I_2, \ldots, I_{n+1}$.

Example 7.4 Find the area I between the two curves $y = e^{2x}$ and $y = -x^2$,
which is bounded to the left by the line x = 1 and to the right by the line
x = 3.


Solution We start by making the obvious identifications $f(x) = e^{2x}$,
$g(x) = -x^2$, a = 1 and b = 3. Then from Eqn (7.17) it follows that

$$I = \int_1^3 (e^{2x} + x^2)\,dx$$

whence, using the results of Example 7.1 and Problem 7.3, we find

$$I = \tfrac{1}{2}(e^6 - e^2) + \tfrac{26}{3}.$$
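As a rough numerical check of this example, the area between the two curves can be approximated by a mid-point sum of f(x) − g(x). The sketch assumes the reading f(x) = exp(2x) and g(x) = −x² on [1, 3] and compares the sum with (e⁶ − e²)/2 + 26/3; the helper names `midpoint_sum` and `area_between` are illustrative only.

```python
import math


def midpoint_sum(f, a, b, n=2000):
    """Approximating sum over n equal sub-intervals, with xi_i at each mid-point."""
    delta = (b - a) / n
    return sum(f(a + (i + 0.5) * delta) for i in range(n)) * delta


def area_between(f, g, a, b, n=2000):
    """Approximate the integral of f(x) - g(x) over [a, b]."""
    return midpoint_sum(lambda x: f(x) - g(x), a, b, n)


if __name__ == "__main__":
    upper_curve = lambda x: math.exp(2.0 * x)   # assumed f(x) = exp(2x)
    lower_curve = lambda x: -x * x              # assumed g(x) = -x**2
    approx = area_between(upper_curve, lower_curve, 1.0, 3.0)
    exact = 0.5 * (math.exp(6.0) - math.exp(2.0)) + 26.0 / 3.0
    print(approx, exact)
```

The mid-point sum slightly underestimates the area here because the integrand is convex, but with 2000 sub-intervals the two numbers agree to about three decimal places.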

The fact that a definite integral is additive with respect to its interval of
integration enables a function to be integrated even when it has discontinuities,
provided only that they are finite in number and that elsewhere the
function is continuous and bounded. This result is perhaps best seen diagrammatically,
though an analytical justification can easily be given without
appeal to geometry. By way of example, consider the function y = f(x)
illustrated in Fig. 7.6 which is bounded and continuous everywhere except at
the discrete number of points $\eta_1, \eta_2, \ldots, \eta_n$. Such a function is said to be
piecewise continuous, for obvious reasons.


Using the valid interpretation of a definite integral in terms of area we see
that the total shaded area I is the sum of the sequence of areas $I_1, I_2, \ldots,
I_{n+1}$, so that we may still write

$$I = \int_a^b f(x)\,dx, \tag{7.18}$$

but this time with the understanding that

$$\int_a^b f(x)\,dx = \int_a^{\eta_1-} f(x)\,dx + \int_{\eta_1+}^{\eta_2-} f(x)\,dx + \cdots + \int_{\eta_n+}^{b} f(x)\,dx. \tag{7.19}$$

Here, as before, we have used $\eta_i-$ to signify the limiting process of
approaching the point $x = \eta_i$ from the left, and $\eta_i+$ to signify the limiting
process of approaching the point $x = \eta_i$ from the right.


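Example 7.5 Evaluate the definite integral

$$I = \int_0^2 f(x)\,dx$$

when

$$f(x) = \begin{cases} x^2 & \text{for } 0 \le x < 1 \\ e^{2x} & \text{for } 1 \le x \le 2. \end{cases}$$

Solution From Eqn (7.19) we have

$$I = \int_0^{1-} x^2\,dx + \int_{1+}^{2} e^{2x}\,dx,$$

so that evaluating the integrals and then taking the appropriate limits gives

$$I = \tfrac{1}{3} + \tfrac{1}{2}(e^4 - e^2).$$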


Sometimes a more difficult situation than this arises in which either the 
integrand tends to infinity at some point in the interval of integration or, 
perhaps, the interval of integration itself is infinite in length. Such definite 
integrals are called improper integrals, and the way in which to attribute a 
value to any such integral is suggested by Eqn (7:19). 


Let us illustrate something of the difficulty that can arise if ideas are not
made precise. Consider the integral

$$\int_{-1}^{1} \frac{dx}{x^2}.$$

Then since $y = 1/x^2$ is essentially positive, the area under the curve must
also be positive. Now if we apply the result of Problem 7.5 we have

$$\int_{-1}^{1} \frac{dx}{x^2} = -2,$$

which, since it is negative, contradicts our previous conclusion. What has
gone wrong? The trouble is that $1/x^2$ tends to infinity as x → 0, so that the
arguments of Problem 7.5 are not applicable, for it was pre-supposed there
that the interval of integration excluded the origin. When dealing with
improper integrals of this type in which the integrand has an infinity within
the interval of integration we shall assign a value to the integral according to
the following definition.


DEFINITION 7.4 (improper integral due to infinity of integrand) Let the
function f(x) be continuous throughout the intervals a ≤ x < c and c < x ≤ b,
and suppose that f(x) has a singularity at x = c in the sense that
f(x) tends to infinity as x → c. Then the integral of f(x) over the interval of
integration [a, b] is said to be improper, and it is defined to have the value

$$I = \lim_{\varepsilon \to 0} \int_a^{c-\varepsilon} f(x)\,dx + \lim_{\delta \to 0} \int_{c+\delta}^{b} f(x)\,dx,$$

whenever both limits involved exist. Under these circumstances the improper
integral will be said to converge to the value I. When either of the limits does
not exist, the integral will be said to be divergent. If the point c coincides with
an end-point of the interval [a, b], then I is defined to be equal to the limit of
the single integral for which the interval of integration lies within [a, b].


On the basis of this definition we are now able to determine the value to 
be attributed to the improper integral used as an illustration above. Let us do 
this in the form of an example. 


Example 7.6 Evaluate the improper integrals:

$$\text{(a) } I_1 = \int_{-1}^{1} \frac{dx}{x^2} \quad\text{and}\quad \text{(b) } I_2 = \int_{-1}^{0} \left(\frac{x^2 + 1}{x^2}\right) dx.$$

Solution The integrand $1/x^2$ tends to infinity as x → 0, so that for case (a),
when appealing to Definition 7.4, we need to make the identifications
a = −1, b = 1, c = 0 and $f(x) = 1/x^2$. Thus,

$$I_1 = \lim_{\varepsilon \to 0} \int_{-1}^{-\varepsilon} \frac{dx}{x^2} + \lim_{\delta \to 0} \int_{\delta}^{1} \frac{dx}{x^2}.$$

Using the result of Problem 7.5 we find that

$$I_1 = \lim_{\varepsilon \to 0} \left(\frac{1}{\varepsilon} - 1\right) + \lim_{\delta \to 0} \left(-1 + \frac{1}{\delta}\right) \to \infty.$$

Thus the improper integral (a) is divergent.

In case (b) the integrand is $(x^2 + 1)/x^2$, which again tends to infinity as
x → 0. However, in this case we must make the identifications a = −1,
b = 0, c = 0, and $f(x) = 1 + 1/x^2$, so that this time the singularity in the
integrand occurs at the right-hand end-point of the interval of integration
[−1, 0] (that is, at the upper limit of integration).
It then follows from Definition 7.4 that

$$I_2 = \lim_{\varepsilon \to 0} \int_{-1}^{-\varepsilon} \left(1 + \frac{1}{x^2}\right) dx,$$

which, from the results of Problems 7.2 (b) and 7.5, becomes

$$I_2 = \lim_{\varepsilon \to 0} \left[(-\varepsilon + 1) + \left(\frac{1}{\varepsilon} - 1\right)\right] \to \infty.$$

Hence the improper integral (b) is also divergent.
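The divergence found in this example can be made concrete by tabulating the truncated integrals for shrinking ε. The sketch below uses the closed forms quoted in the solution, which rest on the assumption that −1/x is the antiderivative of 1/x² supplied by Problem 7.5, and adds an independent mid-point sum as a cross-check on one of them. All function names are illustrative.

```python
def midpoint_sum(f, a, b, n=20000):
    """Approximating sum over n equal sub-intervals, with xi_i at each mid-point."""
    delta = (b - a) / n
    return sum(f(a + (i + 0.5) * delta) for i in range(n)) * delta


def truncated_integral_a(eps, delta):
    """Integral of 1/x**2 over [-1, -eps] plus its integral over [delta, 1]."""
    return (1.0 / eps - 1.0) + (-1.0 + 1.0 / delta)


def truncated_integral_b(eps):
    """Integral of 1 + 1/x**2 over [-1, -eps]."""
    return (1.0 - eps) + (1.0 / eps - 1.0)


if __name__ == "__main__":
    # Cross-check one closed form against a direct sum on [0.1, 1].
    print(midpoint_sum(lambda x: 1.0 / (x * x), 0.1, 1.0), 1.0 / 0.1 - 1.0)
    for k in range(1, 7):
        eps = 10.0 ** (-k)
        # Both columns grow without bound as eps shrinks.
        print(eps, truncated_integral_a(eps, eps), truncated_integral_b(eps))
```

Each tenfold reduction in ε multiplies the truncated integrals by roughly ten, so neither limit in Definition 7.4 can exist.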


The one remaining form of improper integral requiring consideration 
occurs when the interval of integration is infinite. In these circumstances we 
shall assign a value to the integral according to the following definition. 


DEFINITION 7.5 (improper integral due to infinite interval of integration)
Let the function f(x) be continuous on the interval [a, ∞). Then the integral
of f(x) over the interval of integration [a, ∞) is said to be improper, and it is
defined to have the value

$$I_1 = \lim_{k \to \infty} \int_a^k f(x)\,dx,$$

whenever this limit exists. Under these circumstances the improper integral
will be said to converge to the value $I_1$. When the limit does not exist, the
integral will be said to be divergent. Similarly, if the interval of integration is
(−∞, b], then when the limit exists, the improper integral of f(x) over the
interval of integration (−∞, b] is defined to have the value

$$I_2 = \lim_{k \to -\infty} \int_k^b f(x)\,dx.$$

Symbolically, these improper integrals will be denoted, respectively, by

$$I_1 = \int_a^{\infty} f(x)\,dx \quad\text{and}\quad I_2 = \int_{-\infty}^{b} f(x)\,dx.$$

Example 7.7 Evaluate the improper integral 

\[ I = \int_3^{\infty} \frac{dx}{x^2}. \]

Solution It follows at once from Definition 7.5 that 

\[ I = \lim_{k \to \infty} \int_3^k \frac{dx}{x^2}, \]

so that by virtue of the result of Problem 7.5, 

\[ I = \lim_{k \to \infty} \left(\frac{1}{3} - \frac{1}{k}\right) = \frac{1}{3}. \]

Hence this improper integral converges to the value 1/3. 
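
The limiting process in Definition 7.5 can also be watched numerically. The following is only an illustrative sketch in Python, not part of the original argument: the helper `midpoint_integral` and the chosen values of k are assumptions made here, and the midpoint rule is simply one convenient way of approximating each integral over the finite interval [3, k].

```python
def midpoint_integral(f, a, b, n=200_000):
    """Composite midpoint approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def integrand(x):
    return 1.0 / x**2


# Integrals over [3, k] for increasing k; by Problem 7.5 each equals 1/3 - 1/k.
partial_values = {k: midpoint_integral(integrand, 3.0, k) for k in (10.0, 100.0, 1000.0)}

for k, value in partial_values.items():
    print(f"k = {k:7.0f}   numerical = {value:.6f}   1/3 - 1/k = {1/3 - 1/k:.6f}")
```

As k increases the tabulated values approach 1/3, in agreement with the result of Example 7.7.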


7.3 Integral inequalities 


A number of useful inequalities may be deduced concerning definite integrals, 
the simplest of which has already been stated in Eqn (7.1). Let us now derive 
our first result of this type, of which Eqn (7.1) represents a special case. 

Suppose that the definite integrals of f(x) and g(x) taken over the interval 
[a, b] both exist. In brief, let us agree to say that f(x) and g(x) are integrable 
over the interval [a, b]. Now suppose that $f(x) \le g(x)$ for $a \le x \le b$. Then 
if $P_n$ is a partition of [a, b], we have from Theorem 7.2 that 

\[ \int_a^b g(x)\,dx - \int_a^b f(x)\,dx = \int_a^b (g(x) - f(x))\,dx = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} (g(\xi_i) - f(\xi_i))\,\Delta_i, \tag{7.20} \]

where $\xi_i$ is some point in the ith sub-interval of length $\Delta_i$ generated by the 
partition $P_n$. Now since by hypothesis $f(x) \le g(x)$, it follows that $f(\xi_i) \le g(\xi_i)$, 
so that the right-hand side of Eqn (7.20) must be non-negative. Thus 
we have proved the following theorem. 


THEOREM 7.4 (inequality between two definite integrals) Let $f(x) \le g(x)$ 
be two integrable functions over the interval [a, b]. Then, 

\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]


Equation (7.1) follows as a trivial consequence of this result, for the 
theorem implies that if $\phi(x) \le f(x) \le \psi(x)$ are three integrable functions 
over the interval [a, b], then 

\[ \int_a^b \phi(x)\,dx \le \int_a^b f(x)\,dx \le \int_a^b \psi(x)\,dx. \]

Hence, if m, M are, respectively, the minimum and maximum values of f(x) 
on [a, b], our required result follows by setting $\phi(x) = m$, $\psi(x) = M$, when 
we obtain 

\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a). \tag{7.21} \]


This last simple result implies a more important result which we now 
derive by appeal to the intermediate value theorem of Chapter 5. Writing 
inequality (7.21) in the form 

\[ m \le \frac{1}{b - a} \int_a^b f(x)\,dx \le M \]

shows that the number 

\[ \frac{1}{b - a} \int_a^b f(x)\,dx \]

is intermediate between m and M which are extreme values of the function 
f(x) itself. Hence, provided f(x) is continuous, it then follows from the 
intermediate value theorem that some number $\xi$ exists, strictly between a and b, 
such that 

\[ f(\xi) = \frac{1}{b - a} \int_a^b f(x)\,dx. \tag{7.22} \]


This result is called the first mean value theorem for integrals, and it 
constitutes our next theorem. 


THEOREM 7.5 (first mean value theorem for integrals) Let f(x) be 
continuous on the interval [a, b], then there exists a number $\xi$, strictly between 
a and b, for which 

\[ \int_a^b f(x)\,dx = (b - a) f(\xi). \]
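
To make the number $\xi$ of Theorem 7.5 concrete, the following short Python sketch locates it for the illustrative choice f(x) = x² on [0, 3], which is an example chosen here and not one taken from the text. The helper names `mean_value` and `find_xi` are hypothetical, and the bisection step assumes, in addition to continuity, that f is increasing on the interval.

```python
import math


def mean_value(f, a, b, n=10_000):
    """Approximate (1/(b - a)) times the integral of f over [a, b] (midpoint rule)."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)


def find_xi(f, a, b, target, iterations=60):
    """Bisection for f(xi) = target; assumes f is continuous and increasing on [a, b]."""
    lo, hi = a, b
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f(x):
    return x * x


a, b = 0.0, 3.0
average = mean_value(f, a, b)    # the integral is 9, so the mean value is close to 3
xi = find_xi(f, a, b, average)   # the point at which f takes its mean value
print(average, xi, math.sqrt(3.0))
```

Here the mean value of f is 3, so the number $\xi$ is $\sqrt{3}$, which indeed lies strictly between 0 and 3.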


Fig. 7.7 Area below y = f(t) as a function of the upper limit of integration x. 


7.4 The definite integral as a function of its upper 
limit: indefinite integral 

If the lower limit of a definite integral is held constant, but the upper limit is 
replaced by the variable x, then the numerical value of the integral will clearly 
depend on x. Another way of describing this situation is to say that a 
definite integral with a variable upper limit x defines a function of x. In Fig. 
7.7 this idea is illustrated in terms of areas, with the shaded region marked 
F(x) denoting the area below the curve y = f(t) which is bounded on the 
left by the line t = a, and on the right by the line t = x. 
In terms of the definite integral we have 

\[ F(x) = \int_a^x f(t)\,dt. \tag{7.23} \]

Now let us suppose that f(t) is continuous in some interval [a, b], with 
$a < x < b$. Notice here that for the first time it is necessary to use the dummy 
variable t, because x and t are fulfilling two different roles in Eqn (7.23). To be 
precise, x represents the upper limit of integration, whilst the dummy variable 
t represents the general variable in the interval of integration $a \le t \le x$. 

Consider the difference 

\[ F(x + h) - F(x) = \int_a^{x+h} f(t)\,dt - \int_a^x f(t)\,dt = \int_x^{x+h} f(t)\,dt. \tag{7.24} \]


Then the first mean value theorem for integrals allows us to rewrite Eqn (7.24) 
in the form 

\[ F(x + h) - F(x) = h f(\xi), \tag{7.25} \]

where $x < \xi < x + h$. 
Now, forming the difference quotient $\{F(x + h) - F(x)\}/h$, we find 

\[ \frac{F(x + h) - F(x)}{h} = f(\xi), \]

so that taking the limit as $h \to 0$, when $\xi \to x$ and the continuity of f gives $f(\xi) \to f(x)$, we obtain 

\[ F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = f(x). \tag{7.26} \]

This important result shows that the integrand of integral (7.23) at the 
upper limit of integration t = x is equal to the derivative of F(x) with respect 
to x. 

Suppose now that G(x) is any function for which G'(x) = f(x). Then, 

\[ G'(x) - F'(x) = \frac{d}{dx}\,(G(x) - F(x)) = 0, \]

and so from Corollary 5.12 

\[ G(x) = F(x) + \text{constant}. \tag{7.27} \]

Combining Eqns (7.23) and (7.27) shows that the most general function 
G(x) whose derivative is equal to f(x) must be of the form 

\[ G(x) = \int_a^x f(t)\,dt + C, \tag{7.28} \]

where C is a constant. 
The first term on the right-hand side of Eqn (7.28) is called an indefinite 
integral. The function G(x) itself is called either a primitive of f or an anti- 
derivative of f. We shall usually use the name antiderivative, since this offers 
an accurate description of the process by which it is to be found. Namely, an 
antiderivative arises from the process of reversing the operation of differ- 
entiation, and the most frequent method of finding antiderivatives utilizes 
this idea by employing tables of derivatives in reverse. That is to say, by 
matching an integrand with an entry in a table of derivatives and thereby 
finding the functional form of G(x) apart from the additive arbitrary constant. 

Usually the antiderivative G(x) defined in either Eqn (7.27) or Eqn (7.28) 
is written symbolically in the form 

\[ \int f(x)\,dx = F(x) + C. \tag{7.29} \]

In this notation, the fact that an antiderivative is a function related to the 
operation of integration, and not just a number as in an ordinary definite 
integral, is indicated by again employing the integral sign, but this time without 
limits. On occasions the reader will find books in which an antiderivative is 
signified by the notation 

\[ \int^x f(x)\,dx, \]

rather than the notation used in Eqn (7.29). 
The following short table lists a few of the antiderivatives which are of 
most frequent occurrence in mathematics. 


Table 7.1 

$\int f(x)\,dx = F(x) + C$ 

      f(x)              F(x) 
1     a (const)         ax 
2     $x^n$             $x^{n+1}/(n + 1)$ 
3     $e^{\lambda x}$   $(1/\lambda)\,e^{\lambda x}$ 
4     sin x             −cos x 
5     cos x             sin x 


Other useful elementary antiderivatives that should be memorized, 
together with an account of systematic methods for finding antiderivatives, 
are given in the next chapter. 

Let us now return to Eqn (7.27) and notice that, since F(a) = 0, it follows from this that 

\[ G(b) - G(a) = F(b) - F(a) = F(b) = \int_a^b f(x)\,dx. \tag{7.30} \]

Hence we have proved that 

\[ \int_a^b f(x)\,dx = G(b) - G(a), \tag{7.31} \]

where G'(x) = f(x). This provides a method for the evaluation of definite 
integrals, for expressed in words it asserts that the definite integral of f(x) taken 
over an interval [a, b] is the difference between the values of any antiderivative 
of f(x) at x = b and x = a. 

It is now time to express results (7.26) and (7.31) in the form of two basic 
theorems known, respectively, as the first and second fundamental theorems 
of calculus. 


THEOREM 7.6 (first fundamental theorem of calculus) If f(x) is continuous 
for $a \le x \le b$, and 

\[ F(x) = \int_a^x f(t)\,dt, \]

then F'(x) = f(x) for all points x in [a, b]. 
Alternatively expressed, this result may also be written 

\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x). \]


THEOREM 7.7 (second fundamental theorem of calculus) If f(x) is 
continuous for $a \le x \le b$ and G(x) is any antiderivative of f(x), then 

\[ \int_a^b f(t)\,dt = G(b) - G(a). \]

The statement of Theorem 7.7 is often written in the form 

\[ \int_a^b f(x)\,dx = G(x)\Big|_{x=a}^{x=b}, \]

with the understanding that 

\[ G(x)\Big|_{x=a}^{x=b} = G(b) - G(a). \]

It follows from Theorem 7.7 that the definite integral calculated so 
laboriously in Example 7.1 may be evaluated directly by appeal to entry 
number 2 in Table 7.1. To see this set n = 2, so that $f(x) = x^2$, then 
$F(x) = x^3/3$, and by Theorem 7.7 we immediately deduce that 

\[ \int_a^b x^2\,dx = \tfrac{1}{3}(b^3 - a^3). \]
ow 
The systematic employment of the fundamental theorems of calculus will 
be taken up in detail in Chapter 8, since our concern here is primarily with 
the theory rather than the practice of integration. 
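
Both fundamental theorems lend themselves to a simple numerical check. The sketch below is offered only as an illustration under stated assumptions: the helper `integrate` (a composite midpoint rule), the choice f(t) = cos t, the point x = 1.2 and the step sizes are all chosen here and do not come from the text. It forms the quotient of Eqn (7.24) divided by h and compares it with f(x), and then compares a numerical value of the integral of x² over [1, 2] with the value given by Theorem 7.7.

```python
import math


def integrate(f, a, b, n=10_000):
    """Composite midpoint approximation to the definite integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


f = math.cos
x = 1.2

# First theorem: (F(x + h) - F(x)) / h is the integral of f over [x, x + h] divided by h.
quotients = [integrate(f, x, x + h) / h for h in (1e-1, 1e-2, 1e-3, 1e-4)]

# Second theorem: integral of t**2 over [1, 2] against G(2) - G(1) with G(t) = t**3 / 3.
numerical = integrate(lambda t: t**2, 1.0, 2.0)
by_antiderivative = (2.0**3 - 1.0**3) / 3.0

print(quotients, f(x))
print(numerical, by_antiderivative)
```

The difference quotients settle towards cos 1.2 as h decreases, and the two values of the definite integral agree to within the accuracy of the midpoint rule.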

Finally, to emphasize that the indefinite integral is a function, we now 
give an example of such an integral which defines an important mathematical 
function. Since we have the relationship 

\[ \frac{d}{dx} \log_e x = \frac{1}{x} \quad \text{for } x > 0, \]

it follows from Theorem 7.7 that, provided a > 0, 

\[ \int_a^x \frac{dt}{t} = \log_e x - \log_e a. \]

Hence, setting a = 1 gives the result 

\[ \log_e x = \int_1^x \frac{dt}{t}, \tag{7.32} \]

which is illustrated as the shaded area in Fig. 7.8. 


Fig. 7.8 Natural logarithm represented as an area. 


7.5 Differentiation of an integral containing a 
parameter 


It can sometimes happen that an integrand, in addition to being a function of 
x, also depends on a parameter $\alpha$. Furthermore, the upper and lower limits 
of the integral may themselves be functions of $\alpha$ so that the value of the 
integral must then itself depend on $\alpha$. Our concern in this section will be with 
the differentiation, with respect to $\alpha$, of an integral of the form 

\[ I(\alpha) = \int_{\phi(\alpha)}^{\psi(\alpha)} f(x, \alpha)\,dx. \tag{7.33} \]

To derive the form of our result let us begin by assuming that $\phi(\alpha)$, $\psi(\alpha)$ 
are differentiable functions with respect to $\alpha$ in some interval $c \le \alpha \le d$, 
and that $f(x, \alpha)$ is both integrable with respect to x on the interval $[\phi(\alpha), \psi(\alpha)]$ 
and differentiable with respect to $\alpha$. Then, first notice that from the mean 
value theorem for derivatives, in $c \le \alpha + h \le d$, we have 

\[ \begin{aligned}
\phi(\alpha + h) &= \phi(\alpha) + h \left(\frac{d\phi}{d\alpha}\right)_{\alpha = \xi}, && \text{with } \alpha < \xi < \alpha + h; \\
\psi(\alpha + h) &= \psi(\alpha) + h \left(\frac{d\psi}{d\alpha}\right)_{\alpha = \eta}, && \text{with } \alpha < \eta < \alpha + h; \\
f(x, \alpha + h) &= f(x, \alpha) + h \left(\frac{\partial f}{\partial \alpha}\right)_{\alpha = \zeta}, && \text{with } \alpha < \zeta < \alpha + h.
\end{aligned} \tag{7.34} \]


The partial derivative notation is needed in the last of these results because 
for this application of the mean value theorem for derivatives we are regarding 
the variable x as a constant. 
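Now we have 

\[ I(\alpha + h) = \int_{\phi(\alpha + h)}^{\psi(\alpha + h)} f(x, \alpha + h)\,dx, \]

so that using results (7.34) we find 

\[ I(\alpha + h) = \int_{\psi(\alpha)}^{\psi(\alpha) + h\psi'} f(x, \alpha + h)\,dx + \int_{\phi(\alpha)}^{\psi(\alpha)} f(x, \alpha + h)\,dx + \int_{\phi(\alpha) + h\phi'}^{\phi(\alpha)} f(x, \alpha + h)\,dx, \]

where $\psi'$ and $\phi'$ denote the derivatives $(d\psi/d\alpha)_{\alpha = \eta}$ and $(d\phi/d\alpha)_{\alpha = \xi}$ appearing in (7.34). 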


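An application of the mean value theorem for integrals (Theorem 7.5) to 
the first and last terms then shows that 

\[ I(\alpha + h) = h \left(\frac{d\psi}{d\alpha}\right)_{\alpha = \eta} f(x', \alpha + h) + \int_{\phi(\alpha)}^{\psi(\alpha)} f(x, \alpha + h)\,dx - h \left(\frac{d\phi}{d\alpha}\right)_{\alpha = \xi} f(x'', \alpha + h), \]

where $\psi(\alpha) < x' < \psi(\alpha) + h\psi'$, $\phi(\alpha) < x'' < \phi(\alpha) + h\phi'$. 
Next, forming the difference $I(\alpha + h) - I(\alpha)$, combining integrals and 
using the final result of (7.34) gives 

\[ I(\alpha + h) - I(\alpha) = h \left(\frac{d\psi}{d\alpha}\right)_{\alpha = \eta} f(x', \alpha + h) - h \left(\frac{d\phi}{d\alpha}\right)_{\alpha = \xi} f(x'', \alpha + h) + h \int_{\phi(\alpha)}^{\psi(\alpha)} \left(\frac{\partial f}{\partial \alpha}\right)_{\alpha = \zeta} dx. \tag{7.35} \]

Finally, forming the difference quotient $\{I(\alpha + h) - I(\alpha)\}/h$ and taking 
the limit as $h \to 0$ it follows that $\xi$, $\eta$, and $\zeta$ all tend to $\alpha$, whilst $x'$ tends to 
$\psi(\alpha)$ and $x''$ tends to $\phi(\alpha)$, whence 

\[ \frac{dI}{d\alpha} = \left(\frac{d\psi}{d\alpha}\right) f(\psi(\alpha), \alpha) - \left(\frac{d\phi}{d\alpha}\right) f(\phi(\alpha), \alpha) + \int_{\phi(\alpha)}^{\psi(\alpha)} \frac{\partial f}{\partial \alpha}\,dx. \tag{7.36} \]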


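THEOREM 7.8 (differentiation of an integral containing a parameter) Let 
$\phi(\alpha)$, $\psi(\alpha)$ be differentiable functions with respect to $\alpha$ in some interval 
$c \le \alpha \le d$, and let $f(x, \alpha)$ be both integrable with respect to x over the 
interval $\phi(\alpha) \le x \le \psi(\alpha)$ and differentiable with respect to $\alpha$. Then, 

\[ \frac{d}{d\alpha} \int_{\phi(\alpha)}^{\psi(\alpha)} f(x, \alpha)\,dx = \left(\frac{d\psi}{d\alpha}\right) f(\psi(\alpha), \alpha) - \left(\frac{d\phi}{d\alpha}\right) f(\phi(\alpha), \alpha) + \int_{\phi(\alpha)}^{\psi(\alpha)} \frac{\partial f}{\partial \alpha}\,dx. \]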


A useful special case of this arises when $\phi(\alpha) = a$ and $\psi(\alpha) = b$ are 
constants, so that the only dependence on the parameter $\alpha$ is through the 
integrand $f(x, \alpha)$. The terms $d\phi/d\alpha$ and $d\psi/d\alpha$ are then identically zero, so that we 
arrive at the following corollary. 


Corollary 7.8 If $f(x, \alpha)$ is both integrable with respect to x over the interval 
[a, b] and differentiable with respect to $\alpha$, then 

\[ \frac{d}{d\alpha} \int_a^b f(x, \alpha)\,dx = \int_a^b \frac{\partial f}{\partial \alpha}\,dx. \]


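Example 7.8 Apply the results of Theorem 7.8 to the following integral: 

\[ I(\alpha) = \int_{1 + \cos\alpha}^{3 + 2\sin 3\alpha} \frac{dx}{x^2 + \alpha^2}. \]

Solution If we make the identifications $\phi(\alpha) = 1 + \cos\alpha$, $\psi(\alpha) = 3 + 2\sin 3\alpha$, 
and $f(x, \alpha) = (x^2 + \alpha^2)^{-1}$, it then follows directly from Theorem 
7.8 that 

\[ \frac{dI}{d\alpha} = \frac{6\cos 3\alpha}{(3 + 2\sin 3\alpha)^2 + \alpha^2} + \frac{\sin\alpha}{(1 + \cos\alpha)^2 + \alpha^2} - 2\alpha \int_{1 + \cos\alpha}^{3 + 2\sin 3\alpha} \frac{dx}{(x^2 + \alpha^2)^2}. \]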
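
A result of this kind is easily checked numerically. The following Python sketch is an illustration only: the helper `simpson`, the value $\alpha = 0.7$ and the step used in the central difference are assumptions made here. It compares a finite-difference estimate of $dI/d\alpha$ with the three terms supplied by Theorem 7.8 for the integral of Example 7.8.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson approximation (n even) to the integral of f over [a, b]."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def phi(alpha):
    return 1 + math.cos(alpha)          # lower limit


def psi(alpha):
    return 3 + 2 * math.sin(3 * alpha)  # upper limit


def I(alpha):
    return simpson(lambda x: 1 / (x**2 + alpha**2), phi(alpha), psi(alpha))


def dI_by_theorem(alpha):
    upper_term = 6 * math.cos(3 * alpha) / (psi(alpha) ** 2 + alpha**2)
    lower_term = math.sin(alpha) / (phi(alpha) ** 2 + alpha**2)
    inner = simpson(lambda x: 1 / (x**2 + alpha**2) ** 2, phi(alpha), psi(alpha))
    return upper_term + lower_term - 2 * alpha * inner


alpha, step = 0.7, 1e-5
dI_numeric = (I(alpha + step) - I(alpha - step)) / (2 * step)  # central difference
dI_formula = dI_by_theorem(alpha)
print(dI_numeric, dI_formula)
```

The two printed values should agree to several decimal places; a disagreement would point to a slip in one of the three terms.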


7.6 Other geometrical applications of definite integrals 


This section offers a brief discussion of the application of the definite integral 
to the determination of arc length for plane curves, the surface area of a 
surface of revolution, and the volume of a volume of revolution. Each result 
will be derived by appeal to the basic definition of a definite integral, since it 
will first be necessary to define the precise meaning of the concepts that are 
involved. 
Fig. 7.9 (a) Arc length of curve; (b) element of arc length. 


(a) Arc length of a plane curve 


Consider the plane curve $\Gamma$ with the equation y = f(x) illustrated in 
Fig. 7.9 (a). Then our task here will be first to define the meaning of the 
length s of the arc MN, and then to deduce a method by which it may be 
found once the equation of $\Gamma$ has been given. Let $Q_0, Q_1, \ldots, Q_n$ represent 
any set of points on $\Gamma$, the first of which coincides with the left-hand 
end-point M, and the last of which coincides with the right-hand end-point N. 
Then if $\Delta s_i$ denotes the length of the chord joining $Q_{i-1}$ to $Q_i$, the length $s_n$ 
of the polygonal line joining M to N is 

\[ s_n = \sum_{i=1}^{n} \Delta s_i. \]

Now the projection of the set of points $Q_0, Q_1, \ldots, Q_n$ onto the x-axis 
defines a set of points $a = x_0 < x_1 < \cdots < x_n = b$ which form a partition 
$P_n$ of the interval [a, b]. Thus, denoting the norm of $P_n$ by $\|\Delta\|_{P_n}$, we shall 
define the length s of the arc $\Gamma$ from M to N to be 

\[ s = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \Delta s_i. \tag{7.37} \]

Now, setting $\Delta_i = x_i - x_{i-1}$ and $\delta_i = f(x_i) - f(x_{i-1})$, it follows directly 
by an application of Pythagoras' theorem (Fig. 7.9 (b)) that 

\[ \Delta s_i = \sqrt{\Delta_i^2 + \delta_i^2} = \sqrt{1 + \left(\frac{\delta_i}{\Delta_i}\right)^2}\,\Delta_i. \]

However, by virtue of the mean value theorem for derivatives we may write, 
provided that f(x) is differentiable on [a, b], 

\[ \frac{\delta_i}{\Delta_i} = \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} = f'(\xi_i), \]

where $x_{i-1} < \xi_i < x_i$, and so 

\[ \Delta s_i = \sqrt{1 + [f'(\xi_i)]^2}\,\Delta_i. \tag{7.38} \]
Thus the desired arc length s will be determined by evaluating 

\[ s = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \sqrt{1 + [f'(\xi_i)]^2}\,\Delta_i. \tag{7.39} \]

We see from Definition 7.2 that this is simply the definite integral of the 
function $\sqrt{1 + [f'(x)]^2}$ integrated from x = a to x = b, and hence 

\[ s = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \tag{7.40} \]


THEOREM 7.10 (arc length of plane curve) Let y = f(x) be a differentiable 
function on the interval [a, b]. Then the length s of the plane curve $\Gamma$ defined 
by the graph of this function in the (x, y)-plane between the points (a, f(a)), 
(b, f(b)) is given by 

\[ s = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \]

Example 7.9 Determine the length of arc of the curve y = cosh x between the 
points (1, cosh 1) and (3, cosh 3). 


Solution We have a = 1, b = 3, y = cosh x, and so dy/dx = sinh x, 
whence 

\[ s = \int_1^3 \sqrt{1 + \sinh^2 x}\,dx = \int_1^3 \cosh x\,dx. \]

Now since d/dx (sinh x) = cosh x, it follows that sinh x + C is an 
antiderivative of cosh x, so that by Theorem 7.7 we have 

\[ s = \int_1^3 \cosh x\,dx = (\sinh x + C)\Big|_1^3 = \sinh 3 - \sinh 1. \]
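
The definition (7.37) of arc length as the limit of the lengths of inscribed polygonal lines may be illustrated directly. In the following Python sketch the function name `polygonal_length` and the use of equally spaced points are choices made here for illustration; the chord lengths are summed for the curve y = cosh x of Example 7.9 and compared with sinh 3 − sinh 1.

```python
import math


def polygonal_length(f, a, b, n):
    """Length of the polygonal line through n + 1 equally spaced points on y = f(x)."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(
        math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))  # chord length, by Pythagoras
        for i in range(1, n + 1)
    )


exact = math.sinh(3) - math.sinh(1)
approximations = {n: polygonal_length(math.cosh, 1.0, 3.0, n) for n in (4, 40, 400)}

for n, value in approximations.items():
    print(f"n = {n:4d}   polygonal length = {value:.6f}   exact = {exact:.6f}")
```

Each chord is shorter than the arc it spans, so the polygonal lengths increase towards the exact value as the partition is refined.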


Fig. 7.10 Length of parametrically defined curve $\Gamma$. 
Theorem 7.10 will fail for curves $\Gamma$ of the type shown in Fig. 7.10, for 
any representation of the function in the form y = f(x) will not be single 
valued on the interval $[\alpha, \beta]$, and so it will not be differentiable there. 

The difficulty here is easily overcome by using the fact that each point on 
the curve $\Gamma$ can be uniquely defined and a unique derivative assigned if the 
curve $\Gamma$ is capable of parametric representation in the form 

\[ x = \phi(t), \quad y = \psi(t) \quad \text{for } T_0 \le t \le T_1, \tag{7.41} \]

with $\phi(t)$, $\psi(t)$ differentiable on $[T_0, T_1]$. 
Using the result for parametric differentiation 

\[ \frac{dy}{dx} = \frac{\psi'(t)}{\phi'(t)} \]

in Eqn (7.39), and then employing the differential relationship $\Delta_i = \phi'(t)\,\Delta t$ 
to define $\Delta_i$ in terms of $\Delta t$, we find that 

\[ s = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \sqrt{1 + \left[\frac{\psi'(\xi_i)}{\phi'(\xi_i)}\right]^2}\,\phi'(\xi_i)\,\Delta t_i, \tag{7.42} \]

where $t_{i-1} < \xi_i < t_i$. 
Thereafter, the argument that gave rise to Eqn (7.40), now gives rise to 

\[ s = \int_{T_0}^{T_1} \sqrt{1 + \left[\frac{\psi'(t)}{\phi'(t)}\right]^2}\,\phi'(t)\,dt = \int_{T_0}^{T_1} \sqrt{[\phi'(t)]^2 + [\psi'(t)]^2}\,dt. \tag{7.43} \]


THEOREM 7.11 (arc length of parametrically defined curve) Let $\phi(t)$, $\psi(t)$ 
be differentiable functions in $T_0 \le t \le T_1$. Then the length s of the plane 
curve defined parametrically by $x = \phi(t)$, $y = \psi(t)$ between the points 
$(\phi(T_0), \psi(T_0))$, $(\phi(T_1), \psi(T_1))$ is given by 

\[ s = \int_{T_0}^{T_1} \sqrt{[\phi'(t)]^2 + [\psi'(t)]^2}\,dt. \]


(b) Area of surface of revolution 


The name surface of revolution is given to any surface which is generated by 
rotating a plane curve y = f(x) about either the x-axis or the y-axis. Since 
the determination of the area in either case is exactly similar, we shall discuss 
only the case of the revolution of the curve y = f(x) about the x-axis, as 
shown in Fig. 7.11. 

A problem arises here as to how to define the area of a non-cylindrical 
curved surface. We propose to approach the problem by sectioning the surface 
into annular strips of width $\Delta_i$ as shown in Fig. 7.11, and then to approximate 
the area $\Delta S$ of each such annular strip by representing it by the conical area 
which is obtained by rotating the chord PQ of length $\Delta s_i$ about the x-axis. 
Then if this element of area of cone between the planes $x = x_{i-1}$ and $x = x_i$ 
is $\Delta S_i$, this will be given by 

\[ \Delta S_i = \pi (y_{i-1} + y_i)\,\Delta s_i. \tag{7.44} \]


Fig. 7.11 Area of surface of revolution. 


Similar elements of area may be defined for each of the other annular 
strips defined by some partition $P_n$ of the interval [a, b] by the set of points 
$a = x_0 < x_1 < \cdots < x_n = b$. Thus, denoting the norm of $P_n$ by $\|\Delta\|_{P_n}$, 
we shall define the area S of the surface of revolution generated by rotating 
y = f(x) about the x-axis, and contained between the planes x = a and 
x = b, to be 

\[ S = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \Delta S_i = \lim_{\|\Delta\|_{P_n} \to 0} \pi \sum_{i=1}^{n} (y_{i-1} + y_i)\,\Delta s_i. \tag{7.45} \]

Hence, if f(x) is differentiable in $a \le x \le b$, by using result (7.38) we find 

\[ S = \lim_{\|\Delta\|_{P_n} \to 0} 2\pi \sum_{i=1}^{n} f(\xi_i) \sqrt{1 + [f'(\xi_i)]^2}\,\Delta_i, \tag{7.46} \]

where $x_{i-1} < \xi_i < x_i$. 

Once again our previous form of argument shows that this is just the 
definite integral of the function $2\pi f(x)\sqrt{1 + [f'(x)]^2}$ integrated from x = a 
to x = b, and so 

\[ S = 2\pi \int_a^b f(x) \sqrt{1 + [f'(x)]^2}\,dx. \tag{7.47} \]

THEOREM 7.12 (area of surface of revolution) Let f(x) be a differentiable 
function on $a \le x \le b$. Then the area S of the surface of revolution generated 
by rotating the graph of the function y = f(x) about the x-axis, and contained 
between the planes x = a and x = b is given by 

\[ S = 2\pi \int_a^b f(x) \sqrt{1 + [f'(x)]^2}\,dx. \]


Example 7.10 Find the area contained between the planes x = −1 and 
x = 2 of the surface of revolution about the x-axis of the curve y = cosh x. 


Solution We have a = −1, b = 2, and f(x) = cosh x, and so f'(x) = sinh x, 
whence 

\[ S = 2\pi \int_{-1}^{2} \cosh x \sqrt{1 + \sinh^2 x}\,dx = 2\pi \int_{-1}^{2} \cosh^2 x\,dx. \]

To evaluate this result we now use the hyperbolic identity 
$\cosh^2 x = \tfrac{1}{2}(1 + \cosh 2x)$ to obtain 

\[ S = \pi \int_{-1}^{2} (1 + \cosh 2x)\,dx. \]

Then, as it is easily verified that $\tfrac{1}{2}\sinh 2x + C$ is an antiderivative of cosh 2x, 
we have from Theorem 7.7 that 

\[ S = \pi \int_{-1}^{2} (1 + \cosh 2x)\,dx = \pi \left(x + \tfrac{1}{2}\sinh 2x + C\right)\Big|_{-1}^{2} = \tfrac{1}{2}\pi (6 + \sinh 4 + \sinh 2). \]

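
The conical elements of area (7.44) may be summed numerically as a check on this result. The Python sketch below is illustrative only: the name `frustum_area_sum`, the equal spacing of the partition and the number of strips are assumptions made here.

```python
import math


def frustum_area_sum(f, a, b, n):
    """Sum of the conical elements pi * (y_{i-1} + y_i) * (chord length) over n equal strips."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    total = 0.0
    for i in range(1, n + 1):
        y_left, y_right = f(xs[i - 1]), f(xs[i])
        chord = math.hypot(xs[i] - xs[i - 1], y_right - y_left)
        total += math.pi * (y_left + y_right) * chord
    return total


exact = 0.5 * math.pi * (6 + math.sinh(4) + math.sinh(2))
approx = frustum_area_sum(math.cosh, -1.0, 2.0, 2000)
print(approx, exact)
```

With 2000 strips the sum of conical areas agrees with the value found in Example 7.10 to about four decimal places.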
(c) Volume of revolution 


Finally, let us determine the volume of revolution V of the volume shown in 
Fig. 7.11. This time, to define the volume of such a figure, we consider 
cylindrical elements of volume of thickness $\Delta_i$, and place upper and lower 
bounds on that element of volume by the obvious inequality: 

\[ \pi \times (\text{least radius of annulus})^2 \times \Delta_i \le \text{element of volume} \le \pi \times (\text{greatest radius of annulus})^2 \times \Delta_i. \]

Then, if $x_{i-1} < \xi_i < x_i$, a volume element $\Delta V_i$ satisfying this inequality and 
bounded to the left by the plane $x = x_{i-1}$ and to the right by the plane $x = x_i$ 
is 

\[ \Delta V_i = \pi [f(\xi_i)]^2\,\Delta_i. \tag{7.48} \]

The volume of revolution generated by rotating y = f(x) about the x-axis, 
and contained between the planes x = a and x = b will then be defined to be 

\[ V = \lim_{\|\Delta\|_{P_n} \to 0} \pi \sum_{i=1}^{n} [f(\xi_i)]^2\,\Delta_i. \tag{7.49} \]

A repetition of the previous form of argument then yields 

\[ V = \pi \int_a^b [f(x)]^2\,dx. \tag{7.50} \]


Notice that we have imposed no differentiability requirements on f(x), 
so that result (7.50) is applicable even if f(x) is only piecewise continuous. 


THEOREM 7.13 (volume of solid of revolution) Let f(x) be a piecewise 
continuous function on $a \le x \le b$. Then the volume of the solid of 
revolution generated by rotating the curve y = f(x) about the x-axis, and contained 
between the planes x = a and x = b, is given by 

\[ V = \pi \int_a^b [f(x)]^2\,dx. \]


Example 7.11 Determine the volume of revolution generated by rotating the 
parabola $y = 1 + x^2$ about the x-axis, and contained between the planes 
x = 1 and x = 2. 


Solution Here we have a = 1, b = 2, and $f(x) = 1 + x^2$, so that 

\[ V = \pi \int_1^2 (1 + x^2)^2\,dx = \pi \int_1^2 (1 + 2x^2 + x^4)\,dx = \pi \left(x + \tfrac{2}{3}x^3 + \tfrac{1}{5}x^5\right)\Big|_1^2 = \frac{178\pi}{15}. \]
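
The sum (7.49) of cylindrical volume elements can be formed numerically for this example. In the Python sketch below the name `disc_volume_sum` and the choice of the mid-point of each sub-interval for $\xi_i$ are assumptions made for illustration; the comparison value 178π/15 is the result of integrating the polynomial $1 + 2x^2 + x^4$ term by term over [1, 2].

```python
import math


def disc_volume_sum(f, a, b, n):
    """Sum of pi * f(xi_i)**2 * Delta_i, with xi_i the mid-point of each of n equal sub-intervals."""
    h = (b - a) / n
    return sum(math.pi * f(a + (i + 0.5) * h) ** 2 * h for i in range(n))


def parabola(x):
    return 1 + x**2


approx = disc_volume_sum(parabola, 1.0, 2.0, 1000)
closed_form = 178 * math.pi / 15
print(approx, closed_form)
```

The two values agree closely, and the agreement improves as the number of cylindrical elements is increased.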


7.7 Numerical integration 


From the second fundamental theorem of calculus we have seen that the 
successful analytical evaluation of a definite integral involves the 
determination of an antiderivative of the integrand. Although in many practical 
cases of importance an antiderivative can be found, the fact remains that in 
general this is not possible and Theorem 7.7 is therefore of no avail. Such, 
for example, is the case with an integral as simple as 

\[ \int_1^5 e^{-x^2}\,dx, \]

for although an antiderivative of $e^{-x^2}$ certainly exists on theoretical grounds, 
it is not expressible in terms of elementary functions. 

Of the many possible methods whereby a numerical estimate of the value 
of a definite integral may be made, we choose to mention only the very 
simplest ones here. The general process of evaluating a definite integral by 
numerical means will be referred to as numerical integration, though the 
old-fashioned term numerical quadrature is still often employed for such a 
process. The matter of the accuracy of these methods will be taken up 
elsewhere in connection with applications of Taylor's theorem. 


Fig. 7.12 Trapezoidal approximation of area. 


(a) Trapezoidal rule 


Although a strictly analytical derivation of the so-called trapezoidal rule 
for integration may be given we shall not use this approach, and instead 
make appeal to the area representation of a definite integral. Consider Fig. 
7.12, and let us estimate the shaded area below the curve y = f(x) which we 
know has the value 

\[ \int_a^b f(x)\,dx. \]

Let us begin by taking any set of n + 1 points $a = x_0 < x_1 < \cdots < x_n = b$, 
and on each interval $[x_{i-1}, x_i]$, approximate the true area above 
it by the trapezium obtained by replacing the arc of the curve through the 
points $(x_{i-1}, f(x_{i-1}))$, $(x_i, f(x_i))$ by the chord joining these two points. 

Then the area of the trapezium on the interval $[x_{i-1}, x_i]$ is 

\[ \tfrac{1}{2}(f(x_{i-1}) + f(x_i))\,\Delta x_i, \]

where $\Delta x_i = x_i - x_{i-1}$. 
Thus, adding the n contributions of this type, we arrive at the general 
trapezoidal rule 

\[ \int_a^b f(x)\,dx \approx \tfrac{1}{2}(f(x_0) + f(x_1))\,\Delta x_1 + \tfrac{1}{2}(f(x_1) + f(x_2))\,\Delta x_2 + \cdots + \tfrac{1}{2}(f(x_{n-1}) + f(x_n))\,\Delta x_n. \tag{7.51} \]

If the interval [a, b] is divided into n equal parts of length h = (b − a)/n, 
then (7.51) becomes the trapezoidal rule for equal intervals 

\[ \int_a^b f(x)\,dx = h\left\{\tfrac{1}{2}f(x_0) + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \tfrac{1}{2}f(x_n)\right\} + \varepsilon(h), \tag{7.52} \]

where an equality sign has now been used because we have included the 
error term $\varepsilon(h)$, which recognizes that the error is, in part, dependent on the 
magnitude of h. 
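
The general trapezoidal rule (7.51) is easily expressed as a short routine. The following Python sketch is illustrative only: the function name `trapezoidal` and the test integrand 1/x on [1, 2] with ten equal intervals are choices made here, and the error term $\varepsilon(h)$ is simply omitted.

```python
import math


def trapezoidal(xs, ys):
    """General trapezoidal rule: sum of (1/2)(f(x_{i-1}) + f(x_i)) * (x_i - x_{i-1})."""
    return sum(
        0.5 * (ys[i - 1] + ys[i]) * (xs[i] - xs[i - 1])
        for i in range(1, len(xs))
    )


n = 10
xs = [1.0 + i / n for i in range(n + 1)]   # equally spaced here, though any spacing is allowed
ys = [1.0 / x for x in xs]
estimate = trapezoidal(xs, ys)
print(estimate, math.log(2.0))
```

Because the routine works from tabulated points, it applies equally to unequal intervals and to functions known only through a table of values.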


(b) Simpson's rule 

A different approach involves dividing [a, b] into an even number n of 
sub-intervals of equal length h = (b − a)/n, and then approximating the function 
over consecutive pairs of sub-intervals by a quadratic polynomial. That is to 
say, fitting a parabola to the three points (a, f(a)), (a + h, f(a + h)), 
(a + 2h, f(a + 2h)) comprising the first two sub-intervals, and thereafter 
repeating the process until the whole of the interval [a, b] has been covered. 
The value of the definite integral can then be estimated by integrating the 
successive quadratic approximations over their respective intervals of length 
2h and adding the results. This simple idea leads to Simpson's rule for 
numerical integration which we now formulate in analytical terms. 

Consider the first interval [a, a + 2h], and represent the function y = f(x) 
in this interval by the quadratic 

\[ y = c_0 + c_1 x + c_2 x^2. \tag{7.53} \]

Then the approximation to the desired integral taken over this interval is 

\[ \int_a^{a+2h} f(x)\,dx \approx \int_a^{a+2h} (c_0 + c_1 x + c_2 x^2)\,dx = \left(c_0 x + \frac{c_1 x^2}{2} + \frac{c_2 x^3}{3}\right)\Big|_a^{a+2h}. \tag{7.54} \]


To determine the coefficients $c_0$, $c_1$, and $c_2$ in order that the quadratic should 
pass through the three points (a, f(a)), (a + h, f(a + h)), (a + 2h, f(a + 2h)) 
we must solve the three simultaneous equations 

\[ \begin{aligned}
f(a) &= c_0 + c_1 a + c_2 a^2, \\
f(a + h) &= c_0 + c_1 (a + h) + c_2 (a + h)^2, \\
f(a + 2h) &= c_0 + c_1 (a + 2h) + c_2 (a + 2h)^2.
\end{aligned} \tag{7.55} \]

When this is done and the results are substituted into Eqn (7.54) we arrive at 
the desired result 

\[ \int_a^{a+2h} f(x)\,dx = \frac{h}{3}\,(f(a) + 4f(a + h) + f(a + 2h)) + \varepsilon(h), \tag{7.56} \]

where again we have denoted the error term by $\varepsilon(h)$. In its simplest form 
Eqn (7.56), together with its error term, is called Simpson's rule. An explicit 
form for $\varepsilon(h)$ in both the trapezoidal rule and Simpson's rule will be given 
later. 

If, now, result (7.56) is applied to the intervals [a, a + 2h], [a + 2h, 
a + 4h], ..., [a + (n − 2)h, b] and the results are added, we arrive at 
Simpson's rule for an even number n of intervals 

\[ \int_a^b f(x)\,dx = \frac{h}{3}\,[f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + \cdots + 4f(a + (n - 1)h) + f(b)] + \varepsilon(h), \tag{7.57} \]

where h = (b − a)/n. 
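
Formula (7.57) translates almost line for line into a short routine. The Python sketch below is an illustration under assumptions made here: the name `simpson` is hypothetical, the error term $\varepsilon(h)$ is omitted, and the integrands are chosen only to exercise the rule.

```python
import math


def simpson(f, a, b, n):
    """Simpson's rule over an even number n of equal intervals of length h = (b - a)/n."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # weights 4, 2, 4, ..., 4 at the interior points a + h, a + 2h, ..., a + (n - 1)h
        total += (4 if i % 2 else 2) * f(a + i * h)
    return h * total / 3


estimate = simpson(lambda x: 1.0 / x, 1.0, 2.0, 10)
cubic_check = simpson(lambda x: x**3, 0.0, 2.0, 2)   # the exact value of this integral is 4
print(estimate, math.log(2.0), cubic_check)
```

Since the rule is built from fitted parabolas it reproduces the integral of any quadratic exactly, and it happens to do the same for cubics, as the second call shows.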


Example 7.12 Calculate the definite integral 

\[ I = \int_1^2 \frac{dx}{x} \]

by the trapezoidal rule and by Simpson's rule, taking ten integration steps of 
length h = 0.1. 


Solution We start by tabulating the functional values of the integrand 1/x 
at intervals of 0.1. 


x      f(x) = 1/x 
1.0    1.0000 
1.1    0.9091 
1.2    0.8333 
1.3    0.7692 
1.4    0.7143 
1.5    0.6667 
1.6    0.6250 
1.7    0.5882 
1.8    0.5556 
1.9    0.5263 
2.0    0.5000 


Then, using the trapezoidal rule (7.52), we find 

\[ I \approx 0.1 \times [0.5000 + 0.9091 + 0.8333 + 0.7692 + 0.7143 + 0.6667 + 0.6250 + 0.5882 + 0.5556 + 0.5263 + 0.25], \]

whence $I \approx 0.6938$. 
The same calculation using Simpson's rule, (7.57), gives 

\[ I \approx \frac{0.1}{3} \times [1.0000 + 4 \times (0.9091) + 2 \times (0.8333) + 4 \times (0.7692) + 2 \times (0.7143) + 4 \times (0.6667) + 2 \times (0.6250) + 4 \times (0.5882) + 2 \times (0.5556) + 4 \times (0.5263) + 0.5000], \]

whence $I \approx 0.6932$. 
In actual fact the exact result of this definite integral is $\log_e 2 \approx 0.69315$. 
As would have been expected on intuitive grounds, Simpson's rule is more 
accurate than the trapezoidal rule. 


(c) Integration of interpolating polynomials 


A direct extension of the previous method that may be exploited 
systematically to produce integration formulae of high accuracy and flexibility 
involves the replacement of the function y = f(x) over the interval [a, b] by 
an interpolating polynomial of degree n. Thus, on the interval [a, b], the 
function y = f(x) is represented by 

\[ y = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n, \tag{7.58} \]

and the numerical integration formula then follows by writing 

\[ \int_a^b f(x)\,dx \approx \int_a^b (c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n)\,dx. \tag{7.59} \]

Thus, if the error term is again represented by $\varepsilon(h)$, we obtain the numerical 
integration formula 

\[ \int_a^b f(x)\,dx = c_0 (b - a) + \frac{c_1}{2}(b^2 - a^2) + \frac{c_2}{3}(b^3 - a^3) + \cdots + \frac{c_n}{n + 1}(b^{n+1} - a^{n+1}) + \varepsilon(h). \tag{7.60} \]

The difficulty in this approach arises from the fact that the sense in which 
Eqn (7.58) is to approximate y = f(x) is still to be defined, and this will 
influence both the method by which the n + 1 coefficients $c_0, c_1, \ldots, c_n$ 
are to be determined and, naturally, the error term $\varepsilon(h)$. 

Probably the simplest choice of approximating polynomial, and the only 
one to be discussed here, is determined by the requirement that the 
polynomial and the function should have identical values at n + 1 points 
$x_0 < x_1 < \cdots < x_n$ belonging to [a, b]. That is, the requirement that the 
graph of Eqn (7.58) should pass through the n + 1 points $(x_0, f(x_0))$, $(x_1, f(x_1))$, 
..., $(x_n, f(x_n))$. Such a polynomial is called a Lagrangian interpolation 
polynomial, and its form may be written down directly as follows. We 
illustrate the Lagrangian interpolation polynomial $L_3(x)$ of degree 3, which 
passes through the four points $(x_0, f(x_0))$, $(x_1, f(x_1))$, $(x_2, f(x_2))$, and 
$(x_3, f(x_3))$. Higher degree polynomials may be constructed in a similar 
manner. 

\[ \begin{aligned}
L_3(x) ={}& \frac{(x - x_1)(x - x_2)(x - x_3)}{(x_0 - x_1)(x_0 - x_2)(x_0 - x_3)}\,f(x_0)
 + \frac{(x - x_0)(x - x_2)(x - x_3)}{(x_1 - x_0)(x_1 - x_2)(x_1 - x_3)}\,f(x_1) \\
 &+ \frac{(x - x_0)(x - x_1)(x - x_3)}{(x_2 - x_0)(x_2 - x_1)(x_2 - x_3)}\,f(x_2)
 + \frac{(x - x_0)(x - x_1)(x - x_2)}{(x_3 - x_0)(x_3 - x_1)(x_3 - x_2)}\,f(x_3).
\end{aligned} \tag{7.61} \]

This form of approach to the development of an integration formula is 
essential when, as is often the case, the function f(x) is only known in tabular 
form. 


Example 7.13 Given the following tabular values of a function f(x), derive 
the Lagrangian interpolation formula $L_3(x)$ for f(x). 


r    $x_r$    $f(x_r)$ 
0    2        2.131 
1    4        1.242 
2    6        4.507 
3    7        9.702 


Solution It follows by direct substitution into Eqn (7.61) that 

\[ \begin{aligned}
L_3(x) ={}& \frac{(x - 4)(x - 6)(x - 7)}{(-2)(-4)(-5)} \times (2.131)
 + \frac{(x - 2)(x - 6)(x - 7)}{(2)(-2)(-3)} \times (1.242) \\
 &+ \frac{(x - 2)(x - 4)(x - 7)}{(4)(2)(-1)} \times (4.507)
 + \frac{(x - 2)(x - 4)(x - 6)}{(5)(3)(1)} \times (9.702).
\end{aligned} \]

Simplification of this will yield the required third degree polynomial 
which may, if desired, then be integrated over any sub-interval of the interval 
[2,7] on which f(x) is defined, thereby yielding an approximation to the 
definite integral of f(x) integrated over that same sub-interval. 
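
The construction of Example 7.13 can be carried out mechanically. The Python sketch below is an illustration only: the function names `lagrange` and `integral_of_cubic` are hypothetical, and the use of the three-point Simpson formula to integrate the interpolating cubic is a choice made here, justified by the fact that this formula integrates any polynomial of degree not exceeding three without error.

```python
def lagrange(xs, ys, x):
    """Evaluate at x the Lagrangian interpolation polynomial through the points (xs[r], ys[r])."""
    total = 0.0
    for r, (x_r, y_r) in enumerate(zip(xs, ys)):
        term = y_r
        for s, x_s in enumerate(xs):
            if s != r:
                term *= (x - x_s) / (x_r - x_s)
        total += term
    return total


def integral_of_cubic(p, a, b):
    """Integral of a polynomial p of degree <= 3 over [a, b] by the three-point Simpson formula."""
    return (b - a) / 6 * (p(a) + 4 * p(0.5 * (a + b)) + p(b))


# Tabular values of Example 7.13.
xs = [2.0, 4.0, 6.0, 7.0]
ys = [2.131, 1.242, 4.507, 9.702]


def L3(x):
    return lagrange(xs, ys, x)


estimate = integral_of_cubic(L3, 2.0, 7.0)   # approximation to the integral of f over [2, 7]
print([L3(x) for x in xs], estimate)
```

The first printed list reproduces the tabulated values, confirming that the polynomial passes through the four given points.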


PROBLEMS 


Section 7.1 


7.1 Let $f(x) = \lambda x$ on some closed interval $a \le x \le b$ lying in the positive part of 
the x-axis, where $\lambda > 0$ is a constant. Then, if $P_n$ is a partition of [a, b] into n 
sub-intervals of equal length, determine the form of the lower and upper sums 
$\underline{S}_{P_n}$, $\overline{S}_{P_n}$ for f(x) taken over this partition and prove directly by taking the 
limit that 

\[ \lim_{n \to \infty} \underline{S}_{P_n} = \lim_{n \to \infty} \overline{S}_{P_n}. \]

Hence deduce that 

\[ \int_a^b \lambda x\,dx = \frac{\lambda}{2}\,(b^2 - a^2). \]


7.2 Let $\lambda$, $\mu > 0$ be constants, and set $f(x) = \mu + \lambda x$ on some closed interval 
$a \le x \le b$ lying in the positive part of the x-axis. Show, using the method of 
Problem 7.1, that 

\[ \int_a^b (\mu + \lambda x)\,dx = \mu (b - a) + \frac{\lambda}{2}\,(b^2 - a^2). \tag{A} \]

Show also by this method that 

\[ \int_a^b \mu\,dx = \mu (b - a), \tag{B} \]

and deduce from (A), (B) and the result of Problem 7.1 that 

\[ \int_a^b (\mu + \lambda x)\,dx = \int_a^b \mu\,dx + \int_a^b \lambda x\,dx. \]

This provides a direct proof of the linearity of the operation of integration in 
the special case that $f(x) = \mu + \lambda x$. 


7.3 Let $f(x) = e^{\lambda x}$, and take $P_n$ to be a partition of the closed interval [a, b] into 
n sub-intervals of equal length. By taking the numbers $\xi_i$ of Definition 7.1 to 
be at the left-hand end points of the sub-intervals, compute the approximating 
sum $S_{P_n}$ corresponding to $f(x) = e^{\lambda x}$, and by finding its limit prove that 

\[ \int_a^b e^{\lambda x}\,dx = \frac{1}{\lambda}\,(e^{\lambda b} - e^{\lambda a}). \]


7.4 If a < k < b, use the result of Problem 7.3 to deduce that 

\[ \int_a^b e^{\lambda x}\,dx = \int_a^k e^{\lambda x}\,dx + \int_k^b e^{\lambda x}\,dx. \]

This provides a direct proof that the operation of integration is additive with 
respect to the interval of integration in the special case that $f(x) = e^{\lambda x}$. 


7.5 Let [a, b] be any closed interval not containing the origin, and denote by Pm
the partition of this interval into m equal sub-intervals each of length (b − a)/m.
Denote by $x_r$ the point $x_r = a + (r/m)(b - a)$ lying at the right-hand end point
of the rth interval. Then, by setting $\xi_r = \sqrt{x_{r-1}x_r}$ show, by considering
$x_{r-1} - \xi_r$ and $x_r - \xi_r$, that $x_{r-1} < \xi_r < x_r$. By writing f(x) = 1/x² in
Definition 7.2, and taking Pm and the points $\xi_r$ in that definition to be as
defined above, prove that

\[ \int_a^b \frac{dx}{x^2} = \frac{1}{a} - \frac{1}{b}. \]

[Hint: Use the fact that
\[ \sum_{r=1}^{m} \frac{x_r - x_{r-1}}{x_{r-1}x_r} = \sum_{r=1}^{m} \left(\frac{1}{x_{r-1}} - \frac{1}{x_r}\right).] \]


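7.6 Determine the lower bounds $m_i$ and the upper bounds $M_i$ of the function
f(x) = 1/(1 + x²) in each of the n adjacent sub-intervals of length 1/n comprising
a partition Pn of the closed interval [0, 1]. Use these results to deduce
the form taken by the upper and lower sums $\overline{S}_{P_n}$, $\underline{S}_{P_n}$ and show that

\[ \lim_{n\to\infty} \left(\overline{S}_{P_n} - \underline{S}_{P_n}\right) = 0. \]

Deduce from this that

\[ \int_0^1 \frac{dx}{1 + x^2} = \lim_{n\to\infty} n\left(\frac{1}{n^2 + 1^2} + \frac{1}{n^2 + 2^2} + \frac{1}{n^2 + 3^2} + \cdots + \frac{1}{n^2 + n^2}\right) \]

or, equivalently,

\[ \int_0^1 \frac{dx}{1 + x^2} = \lim_{n\to\infty} n\left(\frac{1}{n^2} + \frac{1}{n^2 + 1^2} + \cdots + \frac{1}{n^2 + (n-1)^2}\right). \]

We shall see later that this integral has the value ¼π, and so each of these
different expressions has this same interesting limit.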


Section 7:2 
7-7 Outline the proofs of the results of Theorem 7:3. 
7.8 If f(x) = 2x — 3, use result (A) of Problem 7:2 to evaluate the definite 
integral 
4 
| (2x — 3)dx. 
- 
Rewrite this as the sum of two definite integrals each with a non-negative inte- 
grand and verify that their sum leads to the same result. 


79 Use the result of Problem 7:3 to evaluate the definite integral 


2 
Ϊ e-3 dx. 
4 


7:10 Find the area J between the curves y = x? + 2 and y = —x + 1, which is 
bounded to the left by the line x = —1 and to the right by the line x = 2. 


7.11 Discuss, without attempting to evaluate any integrals that are involved, the
problem of determining the area between the curves y = 1 + sin x and y = 1
+ cos x which is bounded to the left by the line x = 0 and to the right by
the line x = 2π.


7:12 Find the area J between the two curves y = 1/x? and y = e95 — 3, which is 
bounded to the left by the line x = 1 and to the right by the line x = 2. . 


7.13 Evaluate the integral

\[ I = \int_0^3 f(x)\,dx, \]

given that

\[ f(x) = \begin{cases} x & \text{for } 0 < x < 1; \\ 2 + 2x & \text{for } 1 < x < 2; \\ x - 1 & \text{for } 2 < x < 3. \end{cases} \]


7.14 On the assumption that the definite integral

\[ \int_a^b \frac{dx}{\sqrt{1 - x^2}} = \arcsin b - \arcsin a, \]

prove that the improper integral

\[ I = \int_0^1 \frac{dx}{\sqrt{1 - x^2}} \]

is convergent, and determine its value.


7.15 Sketch the area bounded below by the positive x-axis, and above by the line
y = x on the interval 0 ≤ x ≤ 1, and by the curve y = 1/x² on the interval
1 ≤ x < ∞. Determine this area I by the use of an improper integral combined
with elementary geometrical arguments.


Section 7:3 


716 Use Theorem 7:4 to place bounds on the value of the definite integral 
ἀπ 
[= i) e-*” cos? x dx, 


7:17 Evaluate the definite integral 


3 
Ϊ x? dx, 
= | 


and use the result to determine the number ξ in Theorem 7.5 when it is applied
to this definite integral. Is the number ξ unique? Repeat the argument, but
this time applying it to the definite integral


᾿ 
| x2 dx. 
2 


Is there a unique number ξ in this case? 


7.18 Prove the following result which is a restricted form of the second mean value
theorem for integrals. Let f(x) > 0 be continuous and monotonic decreasing
on [a, b], and let g(x) ≥ 0 be continuous on [a, b]. Then,

\[ \int_a^b f(x)\,g(x)\,dx = f(a)\int_a^{\xi} g(x)\,dx, \]

where a ≤ ξ ≤ b. State the corresponding form of the theorem when f(x) > 0
is continuous and monotonic increasing on [a, b]. [Hint: Consider the integrand
f(a){f(x) g(x)/f(a)} and use Theorem 7.4.]


7.19 The requirement of continuity for f(x) in Theorem 7.5 is essential, for without
it the result of the theorem may, or may not, be true. Illustrate this by considering
step functions f(x) defined on the interval [1, 4], and show that it is
possible to define ones for which,

(a) no number ξ exists which satisfies Theorem 7.5;
(b) an infinity of numbers ξ exist satisfying Theorem 7.5.


Section 7.4


7.20 Use Theorem 7.7 to evaluate the following definite integrals:

\[ \text{(a)} \int_a^b (x^{5/2} + 3e^x)\,dx, \qquad \text{(b)} \int_0^{\pi} \sin x\,dx, \qquad \text{(c)} \int_0^{2\pi} \sin x\,dx, \]

\[ \text{(d)} \int_0^{2\pi} |\sin x|\,dx. \]


7.21 Use Theorem 7.7 to determine the area contained between the x-axis and the
curve y = 1 + x³ + 2 sin x, which is bounded to the left by the line x = 0
and to the right by the line x = π.


7:22 Using the basic properties of the logarithmic function listed in Section 6:3, 
express loga x in terms of an indefinite integral, and sketch the interpreta- 
tion of the result as an area below a curve. 


Section 7:5 
7:23 Apply Theorem 7:8 to the following integral, but do not attempt to evaluate 
the result: 


1+a? 
I(a) = Ϊ e-7 cos ax dx. 
a 


7:24 Apply Theorem 7-8 to the following integral, but do not attempt to evaluate 
the result: 


Ka) = [ x4-le- dx, {a> 0). 
0 


7.25 This problem outlines an alternative form of proof for the result of Theorem
7.8. It is based on the chain rule for differentiation and on a direct proof of
Corollary 7.8. Define the function F(α, φ(α), ψ(α)) by the equation

\[ F(\alpha, \phi(\alpha), \psi(\alpha)) = \int_{\phi(\alpha)}^{\psi(\alpha)} f(x, \alpha)\,dx. \]

Then it follows from the chain rule for differentiation that the derivative of
the integral with respect to α is given by

\[ \frac{dF}{d\alpha} = \frac{\partial F}{\partial \alpha} + \frac{\partial F}{\partial \psi}\,\frac{d\psi}{d\alpha} + \frac{\partial F}{\partial \phi}\,\frac{d\phi}{d\alpha}. \tag{A} \]

Use Definition 7.3 together with the first fundamental theorem of calculus
to prove that

\[ \frac{\partial F}{\partial \psi} = f[\psi(\alpha), \alpha] \quad\text{and}\quad \frac{\partial F}{\partial \phi} = -f[\phi(\alpha), \alpha]. \tag{B} \]

Finally, obtain the statement of Theorem 7.8 by substituting results (B) into
(A) and giving a direct proof, as in the text, that

\[ \frac{\partial}{\partial \alpha}\int_{\phi}^{\psi} f(x, \alpha)\,dx = \int_{\phi}^{\psi} \frac{\partial f}{\partial \alpha}\,dx, \]

where for the purposes of partial differentiation with respect to α, the limits φ, ψ
are to be regarded as constants.


Section 7.6 


7.26 Express in terms of a definite integral the arc length of the curve y = 1 + x²
+ sin 2x, that lies between the points on the curve corresponding to x = 1
and x = 4.


7.27 Prove that the circumference of a circle of radius a is 2πa by using the parametric
equations of a circle x = a cos t, y = a sin t with 0 ≤ t ≤ 2π.


7.28 Find the area contained between the planes x = −2 and x = 3 of the surface
of revolution about the x-axis generated by the curve y = 2 + cosh x.
[Hint: An antiderivative of cosh x is sinh x + C.]


7.29 If the curve y = f(x) has an inverse x = φ(y), state the form taken by Theorem
7.12 when the curve y = f(x) between the points (a, f(a)) and (b, f(b)) is
rotated about the y-axis.


7.30 Determine the volume contained between the parabola y = 2 + x + x² and
the cubic y = 5 + 2x + x³, which lies between the planes x = 1 and x = 2.


7.31 If the curve y = f(x) has an inverse x = φ(y), state the form taken by Theorem
7.13 when the curve y = f(x) between the points (a, f(a)) and (b, f(b)) is
rotated about the y-axis.


Section 7:7 


7.32 Evaluate the definite integral

\[ \int_1^3 (x^3 + 2x + 1)\,dx \]

by the trapezoidal rule using four intervals of equal length and then by
Simpson’s rule for the same intervals. Compare the result with that obtained
by direct integration. Infer from your result that Simpson’s rule is exact for
cubic equations despite the fact that it is based on a parabolic fitting of the
function.


7.33 State the form of the Lagrangian interpolation formula L2(x), and use it to
deduce Simpson’s rule, (7.56), by applying it to the three points (a, f(a)),
(a + h, f(a + h)) and (a + 2h, f(a + 2h)) through which the function y =
f(x) passes.


7.34 Let the curve Γ be defined in terms of the polar coordinates (r, θ) by means
of the equation

\[ r = f(\theta), \]

where f(θ) is a continuous function. Then if Pn is a partition of the interval
α ≤ θ ≤ β into the points $\alpha = \theta_0 < \theta_1 < \cdots < \theta_n = \beta$ with the norm
$\|\Delta\|_{P_n}$, prove that the area A between the origin and the curve Γ which is
bounded by the radius vectors θ = α and θ = β is given by

\[ A = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \tfrac{1}{2} f^2(\xi_i)\,\Delta_i, \]

where $\theta_{i-1} < \xi_i < \theta_i$ and $\Delta_i = \theta_i - \theta_{i-1}$. Hence deduce that

\[ A = \frac{1}{2}\int_{\alpha}^{\beta} f^2(\theta)\,d\theta. \]

Use this result to find the area swept out by the radius vector drawn from
the origin to the Archimedean spiral r = kθ between the radius vectors θ = α
and θ = β, with β > α.


7.35 Consider a straight rod of length L which has a uniform cross-sectional area.
Aligning the x-axis with the rod in such a manner that the origin coincides
with the left-hand end point, assume that the mass M(x) of material contained
in the rod in the interval [0, x] is given by

\[ M(x) = \int_0^x \rho(t)\,dt. \]

Then the essentially non-negative function ρ(x) is called the linear density
distribution of the matter in the rod, and by the first fundamental theorem of
calculus it follows that ρ(x) = M′(x).

Now in mechanics the moment of inertia I about an axis of a point mass m
situated at a perpendicular distance x from that axis is defined to be mx². By
considering a partition Pn of 0 ≤ x ≤ L into the points $0 = x_0 < x_1 < \cdots < x_n = L$
with the norm $\|\Delta\|_{P_n}$, prove that the moment of inertia I of the
rod about an axis perpendicular to the rod and passing through an end point
is given by

\[ I = \lim_{\|\Delta\|_{P_n} \to 0} \sum_{i=1}^{n} \xi_i^2\,\rho(\xi_i)\,\Delta_i, \]

where $x_{i-1} < \xi_i < x_i$ and $\Delta_i = x_i - x_{i-1}$. Hence deduce that

\[ I = \int_0^L x^2 \rho(x)\,dx. \]

In the case of a rod of mass M having a uniform linear density ρ(x) = ρ0,
deduce the relationship between ρ0 and M and use it to prove that the moment
of inertia of the rod about an axis perpendicular to its length and passing
through an end point is


\[ \frac{ML^2}{3}. \]


7.36 Consider a circular disk of radius a, and suppose that the mass M(r) of material
contained within a circle of radius r drawn about its centre is given by

\[ M(r) = 2\pi\int_0^r t\,\rho(t)\,dt. \]

Then the essentially non-negative function ρ(r) is called the area density
distribution of the matter in the disk, and by the first fundamental theorem of
calculus it follows that 2πrρ(r) = M′(r).

Use the form of argument outlined in the previous problem to prove that
the moment of inertia I of the disk about an axis perpendicular to its plane
and passing through its centre is given by

\[ I = 2\pi\int_0^a r^3 \rho(r)\,dr. \]

If the disk is of mass M and has a uniform area density ρ(r) = ρ0, deduce the
relationship between ρ0 and M and use it to prove that the moment of inertia
of the disk about an axis perpendicular to its plane and passing through its
centre is

\[ \frac{Ma^2}{2}. \]


7.37 Indicate by means of simple examples how the integral inequality (7.1) may
be used to place upper and lower bounds on the integrals defining the area A
and the moment of inertia I in Problems 7.34 to 7.36.


Systematic integration 


8.1 Integration of elementary functions 


The main objective of this chapter is to explore some of the systematic
methods for determining an antiderivative, that is, a function F(x) whose
derivative is equal to some given function f(x). As described in the previous
chapter, we shall denote the antiderivative of the function f by ∫f(x)dx with
the understanding that

\[ \int f(x)\,dx = F(x) + C \tag{8.1} \]

with C an arbitrary constant.
Alternatively, as any indefinite integral of f must also be an antiderivative
of f, we may identify F(x) in Eqn (8.1) with $\int_a^x f(t)\,dt$, where a is arbitrary, to
obtain the equivalent expression

\[ \int f(x)\,dx = \int_a^x f(t)\,dt + C. \tag{8.2} \]

Remember that the symbol ∫f(x)dx for the antiderivative of f derives from
differentiation and denotes the most general function whose derivative is f.
The allied symbol $\int_a^b f(x)\,dx$, denoting a definite integral of f, derives from
integration and is simply a real number. Considering the definition of an
antiderivative, we shall say that two antiderivatives are equal if they only
differ by a constant.

It should be recalled that the connection between the concepts of an
antiderivative and a definite integral is provided by the fundamental theorem
of calculus, which asserts that

\[ \int_a^b f(x)\,dx = \left[\int f(x)\,dx\right]_{x=a}^{x=b}. \]

In view of Eqn (8.1) this may be written

\[ \int_a^b f(x)\,dx = F(b) - F(a). \tag{8.3} \]


Very often in texts the term indefinite integral is loosely ascribed to the
entire right-hand side of Eqn (8.2) instead of, as here, only to its first term.
This is usually justified by the fact that a is arbitrary though, of course, it
does not necessarily follow that all possible constants C can be absorbed into
the integral by a suitable choice of a. For example, we have the antiderivative

\[ \int \cos x\,dx = \sin x + C, \]

though if for some particular problem it was appropriate to set C = 3, say,
then no choice of the arbitrary constant a would enable us to equate
$\int_a^x \cos x\,dx$ and sin x + 3, for this would imply that sin a = −3.
Unfortunately, the theorems for the differentiation of wide classes of
functions seldom have any counterpart for determining antiderivatives.
Ultimately, success in finding an antiderivative depends on whether or not
the function f can be so simplified that one may be recognized by using tables
of derivatives in reverse: that is, matching the desired derivative f with one
in the table, and reading backwards to deduce an antiderivative. Thus, to
find the antiderivative of 3 sec x tan x, we first glean from Table 5.1 that

\[ \frac{d}{dx}(\sec x) = \sec x \tan x \]

or, equivalently,

\[ \frac{d}{dx}(3\sec x) = 3\sec x \tan x, \]

showing that the antiderivative is

\[ \int 3\sec x \tan x\,dx = 3\sec x + C. \]

In colloquial terms, the process of finding the most general antiderivative of
the function f(x) is called the ‘integration of f(x)’.


Table 8-1 gives a preliminary working list of important integrals which 
has been compiled from the tables of derivatives in Chapters 5 and 6. 


The two separate results shown against number 3 are usually contracted to

\[ \int \frac{dx}{x} = \log|x| + C \]

with the tacit understanding that the arbitrary constant C differs according
as x is positive or negative. With obvious modifications, this convention will
be extended to include all integrals involving the logarithmic function.
Specific examples involving this convention are to be found in Problems
8.1 to 8.3.

The following statement is equivalent to both Eqn (8-1) and Eqn (8-2), 
and it arises as a direct consequence of the definition of an antiderivative. 
We formulate it as a general theorem. 


Table 8.1 Basic table for integrals

 1. $\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + C$  (n ≠ −1);

 2. $\int a^x\,dx = \dfrac{a^x}{\log a} + C$  (a > 0, a ≠ 1);

 3. $\int \dfrac{dx}{x} = \log x + C$ for x > 0, and $\log(-x) + C$ for x < 0;

 4. $\int e^{ax}\,dx = \dfrac{1}{a}\,e^{ax} + C$  (a ≠ 0);

 5. $\int \cos ax\,dx = \dfrac{1}{a}\sin ax + C$  (a ≠ 0);

 6. $\int \sin ax\,dx = -\dfrac{1}{a}\cos ax + C$  (a ≠ 0);

 7. $\int \dfrac{dx}{\sqrt{a^2 - x^2}} = \arcsin\dfrac{x}{a} + C$  for |x| < a;

 8. $\int \dfrac{dx}{a^2 + x^2} = \dfrac{1}{a}\arctan\dfrac{x}{a} + C$  (a ≠ 0);

 9. $\int \dfrac{dx}{\sqrt{a^2 + x^2}} = \operatorname{arcsinh}\dfrac{x}{a} + C$  (a ≠ 0);

10. $\int \dfrac{dx}{\sqrt{x^2 - a^2}} = \operatorname{arccosh}\dfrac{x}{a} + C$ for x > a, and $-\operatorname{arccosh}\left(-\dfrac{x}{a}\right) + C$ for x < −a;

11. $\int \dfrac{dx}{a^2 - x^2} = \dfrac{1}{a}\operatorname{arctanh}\dfrac{x}{a} + C$  for |x| < |a|;

12. $\int \dfrac{dx}{x^2 - a^2} = -\dfrac{1}{a}\operatorname{arccoth}\dfrac{x}{a} + C$  for |x| > |a|.


THEOREM 8.1

\[ \frac{d}{dx}\int f(x)\,dx = f(x). \]


In words, this general result merely asserts the obvious fact that the 
derivative of the antiderivative of a function f(x) is the function f(x) itself. 
Its most frequent application is probably to the verification of antiderivatives. 
For example, let us use the theorem to verify the antiderivative

\[ \int \frac{g'\,dx}{\sqrt{a^2 - g^2}} = \arcsin\left(\frac{g}{a}\right) + C, \tag{A} \]

where g = g(x) is some differentiable function of x and |g| < a.
By Theorem 8.1 we must have

\[ \frac{d}{dx}\int \frac{g'\,dx}{\sqrt{a^2 - g^2}} = \frac{g'}{\sqrt{a^2 - g^2}}. \tag{B} \]

Now, differentiating the right-hand side of (A) we find

\[ \frac{d}{dx}\left[\arcsin\left(\frac{g}{a}\right) + C\right] = \frac{1}{\sqrt{1 - (g/a)^2}}\cdot\frac{g'}{a} = \frac{g'}{\sqrt{a^2 - g^2}}, \]

which is identical with (B). Thus, (A) is verified.
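
The verification of (A) can also be tried numerically. The sketch below is only an illustration under stated assumptions: the choices g(x) = sin x and a = 2 (so that |g| < a) and the helper name `central_difference` are not from the text. It differentiates the right-hand side of (A) by a central difference and compares the result with the integrand.

```python
import math

A = 2.0  # illustrative value of the constant a (an assumption)


def g(x):
    return math.sin(x)  # illustrative g(x); satisfies |g| < a


def g_prime(x):
    return math.cos(x)


def antiderivative(x):
    # Right-hand side of (A) with the arbitrary constant C set to zero.
    return math.asin(g(x) / A)


def integrand(x):
    # Right-hand side of (B).
    return g_prime(x) / math.sqrt(A ** 2 - g(x) ** 2)


def central_difference(func, x, h=1e-5):
    """Approximate the derivative of func at x by a central difference."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


for x in (-1.0, 0.3, 2.5):
    print(x, central_difference(antiderivative, x), integrand(x))
```

The two printed columns agree to within the accuracy of the difference quotient, as Theorem 8.1 requires.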

A final general result of great value is the fact that the derivative of a
linear combination of functions is equal to the same linear combination of
their derivatives (Theorem 5.4). Expressed in terms of antiderivatives this
implies the following general theorem.


THEOREM 8.2

\[ \int (k_1 f + k_2 g)\,dx = k_1\int f\,dx + k_2\int g\,dx. \]

It is, of course, this theorem that permits us to simplify many expressions
to the point at which antiderivatives may be deduced from tables of standard
integrals (antiderivatives) such as Table 8.1. Hence we have

\[ \int (5x^2 - 2\cos x)\,dx = 5\int x^2\,dx - 2\int \cos x\,dx = \frac{5x^3}{3} - 2\sin x + C. \]

The separate arbitrary constants associated with each of the antiderivatives 
on the right-hand side have, of course, been combined into the single arbitrary 
constant C. 

The remaining sections of this chapter are concerned with outlining the 
details of the main techniques available for finding antiderivatives. 


8.2 Integration by substitution


Possibly the most frequently used technique of integration is that in which
the variable under the integral sign is changed in a manner which simplifies
the task of finding the antiderivative. This process is known as integration by
substitution or integration by change of variable. It is in this technique that
the full significance of the symbol dx in Eqn (8.1) is first realized. Indeed, by
making a straightforward application of the chain rule for differentiation
(Theorem 5.7) we shall arrive at a simple mechanical rule for effecting a
variable change by using differentials.

Because composite functions (functions of a function) of x often occur 
under the integral sign we shall consider a general antiderivative of the form 


\[ I = \int k(x)\,f[g(x)]\,dx. \]

In order to cover all likely cases we shall consider the effect on I of changing
the variable x to the variable u, where x and u are related by

\[ g(x) = h(u), \tag{8.4} \]

with g, h differentiable functions.
Let us start by supposing that

\[ I = \int k(x)\,f[g(x)]\,dx = F(x) + C, \tag{8.5} \]

so that we know

\[ \frac{dF}{dx} = k(x)\,f[g(x)]. \tag{8.6} \]

Applying the chain rule to F(x) gives

\[ \frac{dF(x)}{du} = \frac{dF}{dx}\,\frac{dx}{du}, \]

which, by virtue of Eqn (8.6), may be written

\[ \frac{dF(x)}{du} = k(x)\,f[g(x)]\,\frac{dx}{du}. \]

On the assumption that Eqn (8.4) may be solved for x in the form

\[ x = g^{-1}[h(u)] \tag{8.7} \]

we arrive at the result

\[ \frac{dF(x)}{du} = k\{g^{-1}[h(u)]\}\,f[h(u)]\,\frac{dx}{du}. \tag{8.8} \]

Now by implicit differentiation (Corollary 5.19 (a)) of Eqn (8.4), it follows
that provided g′(x) ≠ 0,

\[ \frac{dx}{du} = \frac{h'(u)}{g'(x)}, \]

so that

\[ \frac{dF(x)}{du} = \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}. \tag{8.9} \]

However, Eqn (8.9) simply asserts that F(x) is an indefinite integral of

\[ \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}, \]

and thus taking the antiderivative yields

\[ \int \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}\,du = F + C^*, \tag{8.10} \]

where C* is an arbitrary constant.

Comparing Eqns (8.5) and (8.10), and on the understanding that two
antiderivatives are equal if they only differ by a constant, we have thus proved
that

\[ \int k(x)\,f[g(x)]\,dx = \int \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}\,du. \tag{8.11} \]

This forms the result of the following theorem:

THEOREM 8.3 (integration by substitution) If g, h are differentiable functions
and g(x) = h(u), with g′(x) ≠ 0 and x = g⁻¹[h(u)], then

\[ \int k(x)\,f[g(x)]\,dx = \int \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}\,du. \]

Two special cases occur when (a) k(x) = 1 and g(x) = x, so that g′(x)
= 1, and (b) k(x) = 1 and h(u) = u, so that h′(u) = 1. These are stated as
Corollaries 8.3 (a, b) below, which are the results most often to be found in
textbooks.


Corollary 8.3 (a) If x = h(u) is a differentiable function of u, then

\[ \int f(x)\,dx = \int f[h(u)]\,h'(u)\,du. \]

In terms of the differential relationship dh = h′(u)du this is also capable of
expression in the form

\[ \int f(x)\,dx = \int f(h)\,dh. \]


Corollary 8.3 (b) If g(x) = u is a differentiable function of x, with
g′(x) ≠ 0, then

\[ \int f[g(x)]\,dx = \int f(u)\left(\frac{dx}{du}\right)du \]

or, as dx/du = 1/g′[g⁻¹(u)],

\[ \int f[g(x)]\,dx = \int \frac{f(u)}{g'[g^{-1}(u)]}\,du. \]

All of these results may be conveniently summarized in the form of a single
simple mechanical rule for changing the variable in an antiderivative.


Rule 1 (Integration by substitution)
We suppose that in the antiderivative

\[ I = \int k(x)\,f[g(x)]\,dx \]

it is required to change from the variable x to the variable u by means of the
relationship g(x) = h(u), where g and h are differentiable functions, with
g′(x) ≠ 0. The result may be deduced from I above by:

(a) replacing g(x) in f[g(x)] by h(u);

(b) solving g(x) = h(u) for x in the form x = g⁻¹[h(u)] and then replacing
x in k(x) by this result;

(c) replacing dx by its expression in terms of du, obtained from the differential
relationship g′(x)dx = h′(u)du;

(d) replacing x in g′(x) by x = g⁻¹[h(u)].


We now illustrate the application of this rule in a series of examples.
Unfortunately, although the rule tells us how to change the variable, it offers
us no information on the type of variable change that should be made. That
is to say it does not tell us the functional form of f and g. Only experience
can help here.


Example 8.1 Evaluate the antiderivative

\[ I = \int x^3\sqrt{1 + x^2}\,dx. \]

Solution This antiderivative is of the most general type contained in
Theorem 8.3. First we make the obvious identification k(x) = x³ and then,
to remove the square root function which is difficult to manipulate, we shall
try setting

\[ 1 + x^2 = u^2. \]

That is to say, in the hope that it will lead to a simpler expression, we make
the further identifications

\[ g(x) = 1 + x^2 \quad\text{and}\quad h(u) = u^2. \]

The function f in Theorem 8.3 then becomes the square root function, with
√(1 + x²) = u. Rather than solving for x, for the moment we shall use the
result x³ = x · x² = x(u² − 1), when we find x³√(1 + x²) = xu(u² − 1).
Now g′(x) = 2x and h′(u) = 2u, so that the differential relation g′(x)dx
= h′(u)du gives rise to x dx = u du. Hence, in differential form,

\[ x^3\sqrt{1 + x^2}\,dx = u(u^2 - 1)\,x\,dx = u^2(u^2 - 1)\,du, \]

and so by the rule derived from Theorem 8.3,

\[ I = \int x^3\sqrt{1 + x^2}\,dx = \int u^2(u^2 - 1)\,du. \]

The antiderivative on the right-hand side is now straightforward and may be
integrated on sight to give

\[ I = \frac{u^5}{5} - \frac{u^3}{3} + C \]

or,

\[ I = \frac{(1 + x^2)^{5/2}}{5} - \frac{(1 + x^2)^{3/2}}{3} + C. \]
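
The change of variable in Example 8.1 can be checked numerically. The following is a minimal sketch, not taken from the text: the limits 0 and 1 for x are an illustrative assumption, and `simpson` is a hand-written composite Simpson's rule. With u = √(1 + x²) the limits become 1 and √2, so the x-integral, the u-integral and the difference of the antiderivative at the end points should all agree.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson's rule with an even number n of sub-intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def integrand_x(x):
    return x ** 3 * math.sqrt(1.0 + x ** 2)


def integrand_u(u):
    return u ** 2 * (u ** 2 - 1.0)


def antiderivative(x):
    # Result of Example 8.1 with C = 0.
    s = 1.0 + x ** 2
    return s ** 2.5 / 5.0 - s ** 1.5 / 3.0


a, b = 0.0, 1.0  # illustrative limits (an assumption)
lhs = simpson(integrand_x, a, b)
rhs = simpson(integrand_u, math.sqrt(1.0 + a ** 2), math.sqrt(1.0 + b ** 2))
exact = antiderivative(b) - antiderivative(a)
print(lhs, rhs, exact)
```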


Example 8.2 Evaluate the antiderivative

\[ I = \int \sqrt{1 + x^2}\,dx. \]

Solution In this antiderivative k(x) = 1, but it is not immediately clear how
best to change the variable. It is left to the reader to see why neither of the
possible substitutions u² = 1 + x² or u = 1 + x² brings about any effective
simplification. Instead, let us seek to remove the square root by making the
substitution x = sinh u, so that the problem becomes analogous to Corollary
8.3 (a). Then 1 + x² = 1 + sinh² u = cosh² u, so that √(1 + x²) = cosh u.
Next, as g(x) = x and h(u) = sinh u, g′(x) = 1, h′(u) = cosh u and so
dx = cosh u du. Applying the rule then gives

\[ \sqrt{1 + x^2}\,dx = \cosh u \cdot \cosh u\,du = \cosh^2 u\,du, \]

whence

\[ I = \int \cosh^2 u\,du. \]

Now use the identity cosh² u = ½(cosh 2u + 1) to give

\[ I = \tfrac{1}{2}\int (\cosh 2u + 1)\,du = \tfrac{1}{4}\sinh 2u + \frac{u}{2} + C. \]

To return to the variable x it is necessary to use the results u = arcsinh x,
cosh u = √(1 + x²) together with the identity sinh 2u = 2 sinh u · cosh u
to obtain

\[ I = \tfrac{1}{2}\left[x\sqrt{1 + x^2} + \operatorname{arcsinh} x\right] + C. \]


Example 8.3 Evaluate the antiderivative

\[ I = \int \cos(1 + 3x)\,dx. \]

Solution This antiderivative has k(x) = 1, and by setting 1 + 3x = u so
that g(x) = 1 + 3x, h(u) = u it reduces to the situation of Corollary 8.3 (b).
Applying the rule we find that cos (1 + 3x) = cos u and 3dx = du, whence

\[ I = \int \tfrac{1}{3}\cos u\,du = \tfrac{1}{3}\sin u + C, \]

and thus

\[ I = \tfrac{1}{3}\sin(1 + 3x) + C. \]


Example 8.4 Evaluate the antiderivative

\[ I = \int 2x\sqrt{1 + x^2}\,dx. \]

Solution Setting u = 1 + x² it follows that du = 2x dx, so that

\[ 2x\sqrt{1 + x^2}\,dx = \sqrt{u}\,du, \]

whence

\[ I = \int \sqrt{u}\,du = \tfrac{2}{3}u^{3/2} + C = \tfrac{2}{3}(1 + x^2)^{3/2} + C. \]


It is interesting to notice that when the situation found in Example 8.4 is
expressed in terms of Theorem 8.3 by making the identification k(x) = g′(x)
and then setting u = g(x) it gives rise to the general result

\[ \int g'(x)\,f[g(x)]\,dx = \int f(u)\,du. \tag{8.12} \]

This is not, of course, a new result since it is no more than the statement
of Corollary 8.3 (a) with the roles of x and u interchanged.

It is an immediate consequence of Eqn (8-3) that Theorem 8-3, together 
with its corollaries, also applies to definite integrals provided that the limits 
are also transformed by the same transformation law. The restatement of 
Theorem 8-3 in terms of definite integrals is as follows: 


THEOREM 8.4 (integration of definite integrals by substitution) If g, h are
differentiable functions and g(x) = h(u), with g′(x) ≠ 0 and x = g⁻¹[h(u)],
u = h⁻¹[g(x)], then

\[ \int_a^b k(x)\,f[g(x)]\,dx = \int_{h^{-1}[g(a)]}^{h^{-1}[g(b)]} \frac{k\{g^{-1}[h(u)]\}\,f[h(u)]\,h'(u)}{g'\{g^{-1}[h(u)]\}}\,du. \]

One specially simple case of this theorem merits recording in the form of a
corollary. It is the result corresponding to Eqn (8.12) and is obtained by
making the identifications k(x) = g′(x), u = g(x).

Corollary 8.4 If u = g(x) is a differentiable function, then

\[ \int_a^b f[g(x)]\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du. \]


When expressed in the form of a mechanical rule, Theorem 8-4 is as 
straightforward to apply as was our previous rule. 


Rule 2 (Integrating definite integrals by substitution)
We suppose that in the definite integral

\[ I = \int_a^b k(x)\,f[g(x)]\,dx \]

it is required to change from the variable x to the variable u by means of the
relationship g(x) = h(u), where g and h are differentiable functions, with
g′(x) ≠ 0. The result may be deduced from I above by:

(a) transforming the differential expression k(x) f[g(x)]dx as indicated
in Rule 1;

(b) solving g(x) = h(u) for u in the form u = h⁻¹[g(x)] and replacing
the upper limit b by h⁻¹[g(b)] and the lower limit a by h⁻¹[g(a)].


Example 8.5 Evaluate the definite integral

\[ I = \int_0^1 x^2\sqrt{1 - x^2}\,dx. \]

Solution Let us make the substitution x = sin u, so that dx = cos u du,
when

\[ x^2\sqrt{1 - x^2}\,dx = \sin^2 u \cdot \cos u \cdot \cos u\,du = \sin^2 u \cdot \cos^2 u\,du. \]

Then, as u = arcsin x, using the principal branch of the inverse sine function, we
find from Rule 2 that

\[ I = \int_0^1 x^2\sqrt{1 - x^2}\,dx = \int_{\arcsin 0}^{\arcsin 1} \sin^2 u \cdot \cos^2 u\,du = \int_0^{\pi/2} \sin^2 u \cdot \cos^2 u\,du. \]

To evaluate this last definite integral we use a technique from Chapter 6
which is often helpful. From Definition 6.6 we may write

\[ \sin^2 u \cdot \cos^2 u = \left(\frac{e^{iu} - e^{-iu}}{2i}\right)^2 \left(\frac{e^{iu} + e^{-iu}}{2}\right)^2 = \left(\frac{e^{2iu} - e^{-2iu}}{4i}\right)^2 = \frac{e^{4iu} + e^{-4iu} - 2}{-16}, \]

and thus

\[ \sin^2 u \cdot \cos^2 u = \tfrac{1}{8}(1 - \cos 4u). \]

Using this result in the definite integral, which may then be evaluated on
sight, we finally obtain

\[ I = \tfrac{1}{8}\int_0^{\pi/2} (1 - \cos 4u)\,du = \tfrac{1}{8}\left[u - \tfrac{1}{4}\sin 4u\right]_0^{\pi/2} = \tfrac{1}{16}\pi, \]

and so

\[ \int_0^1 x^2\sqrt{1 - x^2}\,dx = \tfrac{1}{16}\pi. \]


Example 8.6 Evaluate the definite integral

\[ I = \int_0^1 (2x + 5)\cosh(x^2 + 5x + 1)\,dx. \]

Solution Inspection shows that this example is of the form of Corollary 8.4,
with the function f = cosh and g(x) = x² + 5x + 1.
As g(0) = 1, g(1) = 7, by setting u = g(x) we at once obtain

\[ I = \int_1^7 \cosh u\,du = \sinh 7 - \sinh 1. \]
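
The values found in Examples 8.5 and 8.6 can be compared with direct numerical integration. This is a rough sketch under stated assumptions: `simpson` and `midpoint` are hand-written helper routines, not anything defined in the text, and the numbers of sub-intervals are arbitrary choices. The midpoint rule is used for the original integrand of Example 8.5 because its derivative is unbounded at x = 1, where Simpson's rule converges slowly.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson's rule with an even number n of sub-intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def midpoint(func, a, b, n):
    """Composite midpoint rule; never evaluates func at the end points."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Example 8.5: both sides of the substitution x = sin u.
ex85_x = midpoint(lambda x: x ** 2 * math.sqrt(1.0 - x ** 2), 0.0, 1.0, 20000)
ex85_u = simpson(lambda u: math.sin(u) ** 2 * math.cos(u) ** 2, 0.0, math.pi / 2, 200)
print(ex85_x, ex85_u, math.pi / 16)

# Example 8.6: original integral against sinh 7 - sinh 1.
ex86_x = simpson(lambda x: (2 * x + 5) * math.cosh(x * x + 5 * x + 1), 0.0, 1.0, 2000)
ex86_u = math.sinh(7.0) - math.sinh(1.0)
print(ex86_x, ex86_u)
```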


8.3 Integration by parts


This most valuable technique is based on Theorem 5.5, concerning the
derivative of the product of two functions. That theorem asserts that if f, g
are two differentiable functions of x, then

\[ \frac{d}{dx}\left[f(x)\,g(x)\right] = f(x)\,g'(x) + g(x)\,f'(x). \]

Taking the antiderivative of this result gives

\[ f(x)\,g(x) = \int f(x)\,g'(x)\,dx + \int g(x)\,f'(x)\,dx \]

which, on rearrangement, becomes

\[ \int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int g(x)\,f'(x)\,dx. \tag{8.13} \]

This is one form of the required result. Using the differential notation
df = f′(x)dx, dg = g′(x)dx enables this to be contracted to the equivalent
and easily remembered alternative form

\[ \int f\,dg = fg - \int g\,df. \tag{8.14} \]

These results are now formulated as our next theorem:


THEOREM 8.5 (integration by parts) If f, g are differentiable functions of x,
then

\[ \int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int g(x)\,f'(x)\,dx \]

or, expressed in differential notation,

\[ \int f\,dg = fg - \int g\,df. \]

This useful theorem is the nearest possible approach to a general theorem
for finding the antiderivative of the product of two functions. It depends on
the fact that often the antiderivative ∫g df is easier to determine than the
antiderivative ∫f dg. Naturally, the technique of integration by substitution
can also be employed when evaluating ∫g df.

When definite integrals are involved it is not difficult to see that the result
is still valid provided the limits are also applied to the product fg. The
result is as follows:


THEOREM 8.6 (integration by parts: definite integral) If f, g are differentiable
functions of x in [a, b], then

\[ \int_a^b f(x)\,g'(x)\,dx = \Big[f(x)\,g(x)\Big]_a^b - \int_a^b g(x)\,f'(x)\,dx
   = f(b)\,g(b) - f(a)\,g(a) - \int_a^b g(x)\,f'(x)\,dx. \]


As before, we illustrate both of these theorems by means of a series of 
examples. These have been carefully chosen to demonstrate a variety of 
situations in which integration by parts is useful. 


Example 8.7 Evaluate the antiderivative

\[ I = \int x^k \log x\,dx \quad\text{for } x > 0,\ k \ne -1. \]

Solution The problem here, as with all applications of the technique of
integration by parts, is to decide upon the functions f and g. A little
experimentation will soon convince the reader that I will only simplify if we set
f(x) = log x and g(x) = x^(k+1)/(k + 1), for then g′(x) = x^k and f′(x) = 1/x.
Accordingly we write I in the form

\[ I = \int \log x\,d\left[\frac{x^{k+1}}{k+1}\right]. \]

Applying Theorem 8.5 gives

\[ I = \frac{x^{k+1}\log x}{k+1} - \int \frac{x^k}{k+1}\,dx = \frac{x^{k+1}\log x}{k+1} - \frac{x^{k+1}}{(k+1)^2} + C. \]

Example 8.8 Evaluate the definite integral

\[ \int_0^{1/2} \arcsin x\,dx. \]

Solution This time we make the identifications f(x) = arcsin x and g(x) = x
and write

\[ \int_0^{1/2} \arcsin x\,d[x] = \Big[x \arcsin x\Big]_0^{1/2} - \int_0^{1/2} \frac{x\,dx}{\sqrt{1 - x^2}}. \tag{A} \]

We have

\[ \Big[x \arcsin x\Big]_0^{1/2} = \pi/12 - 0 = \pi/12 \]

but the definite integral on the right-hand side is still not recognizable. To
simplify it let us now set u = 1 − x² so that x dx = −½ du; using Theorem
8.4 we obtain

\[ \int_0^{1/2} \frac{x\,dx}{\sqrt{1 - x^2}} = -\frac{1}{2}\int_1^{3/4} \frac{du}{\sqrt{u}} = 1 - \frac{\sqrt{3}}{2}. \]

Combining this result with (A) gives

\[ \int_0^{1/2} \arcsin x\,dx = \pi/12 + \frac{\sqrt{3}}{2} - 1. \]
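
The two pieces produced by integration by parts in Example 8.8 can be checked against a direct numerical evaluation. The following is a minimal sketch; `simpson` is a hand-written helper and the variable names are illustrative, not from the text.

```python
import math


def simpson(func, a, b, n=100):
    """Composite Simpson's rule with an even number n of sub-intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


# Direct numerical value of the integral of arcsin x over [0, 1/2].
numeric = simpson(math.asin, 0.0, 0.5)

# Integration by parts: boundary term minus the remaining integral.
boundary_term = 0.5 * math.asin(0.5)        # [x arcsin x] between 0 and 1/2
remaining = 1.0 - math.sqrt(3.0) / 2.0      # integral of x/sqrt(1 - x^2) over [0, 1/2]
by_parts = boundary_term - remaining

print(numeric, by_parts, math.pi / 12 + math.sqrt(3.0) / 2.0 - 1.0)
```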


Example 8.9 Evaluate the antiderivative 
IT = [ἐδ sin bx dx. 


Solution This time we choose to make the identification f(x) = sin bx, 
2(x) = (1/a)e% and to write J in the form 


re [ sin bx d (- or). 
a 


Integrating by parts we find 


I b 
βρῶ sin bx dx = 7 e@% sin bx — - | e2% cos bx dx. 
a 
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Now let us use this same device on the second term above to obtain 


I 
| ΘΖ sin bx dx = — e% sin bx — 2 [ cos χα (- ear] 
a a a 


l b b2 
= -- εὐ sin bx — — e®* cos bx — — | βρῶ sin bx dx + C. 
a a2 a? 


Combining terms gives 


b2 aL ' ἘΞ 
(1 +5) { err sin bx dx = eZ (a sin bx — bcos bx) we 


a2 
and so 
οὔ (a sin bx — bcos bx) 
αὐ ah heat = .---......-. Ὁ Ὁ Ὁ ς΄ C* 
[- Sin DX Gx P+ be Ἔ 


where ΟὟ is related to C by C* = "(κα + 5). In fact there is no necessity 
to distinguish between C and C*, since as C was an arbitrary constant of 
integration, C* is also an arbitrary constant. For this reason it is not 
customary to redefine arbitrary constants when, as above, they are simply 
multiplied by a constant factor. 
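A closed form obtained by integrating by parts twice is easy to get wrong by a sign, so a numerical spot check can be reassuring. The sketch below is an added illustration rather than part of the original text: it differentiates the result of Example 8.9 by a central difference and compares it with the integrand. The function names and the sample values of a and b are assumptions chosen only for the demonstration.

```python
import math

def closed_form(x, a, b):
    """Antiderivative of e^{ax} sin(bx) found in Example 8.9, taking C* = 0."""
    return math.exp(a * x) * (a * math.sin(b * x) - b * math.cos(b * x)) / (a * a + b * b)

def integrand(x, a, b):
    return math.exp(a * x) * math.sin(b * x)

def central_difference(f, x, h=1e-5):
    """Second-order finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

a, b = 0.7, 2.3  # arbitrary illustrative values
for x in (-1.0, 0.0, 0.5, 2.0):
    slope = central_difference(lambda s: closed_form(s, a, b), x)
    # The derivative of the closed form should reproduce the integrand.
    assert abs(slope - integrand(x, a, b)) < 1e-6
```

The check is only numerical and at a handful of points, so it supports the formula without proving it.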


8.4 Reduction formulae


It not infrequently happens that an antiderivative I involving a parameter m
may be reduced by means of the technique of integration by parts to an
expression in which the parameter has a value differing by an integer k from
its original value. If we denote such an antiderivative by $I_m$, then a typical
situation is the one in which we arrive at an expression of the form

\[ I_m = A(m) + I_{m-k}, \tag{8.15} \]

where A(m) is some known function.

Expressions of this form provide an algorithm for the computation of any
antiderivative of the given type once one of them is known, for the $I_m$ are
then defined recursively by this relation in terms of $I_0$, say. It is customary
to refer to expressions of the general form of Eqn (8.15) as reduction formulae.
The same idea is equally applicable, without essential modification, to definite
integrals.


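Example 8.10 Determine the reduction formula for

\[ I_m = \int \cos^m \theta \, d\theta. \]

Use the result to determine $I_7$.

Solution We rewrite $I_m$ as follows and use integration by parts.

\[ I_m = \int \cos^{m-1}\theta \; d(\sin\theta) \]
\[ \quad = \cos^{m-1}\theta \,\sin\theta
     - \int \sin\theta \,(m-1)\cos^{m-2}\theta\,(-\sin\theta)\,d\theta \]
\[ \quad = \cos^{m-1}\theta \,\sin\theta
     + (m-1)\int \cos^{m-2}\theta \,\sin^2\theta \, d\theta \]
\[ \quad = \cos^{m-1}\theta \,\sin\theta
     + (m-1)\int \cos^{m-2}\theta \,(1 - \cos^2\theta)\, d\theta \]
\[ \quad = \cos^{m-1}\theta \,\sin\theta
     + (m-1)\int \cos^{m-2}\theta \, d\theta
     - (m-1)\int \cos^{m}\theta \, d\theta. \]

Recalling the definition of $I_m$ we discover that this may be re-expressed in
terms of $I_m$ and $I_{m-2}$ as

\[ I_m = \cos^{m-1}\theta \,\sin\theta + (m-1) I_{m-2} - (m-1) I_m, \]

whence we arrive at the required reduction formula

\[ I_m = \frac{\cos^{m-1}\theta \,\sin\theta}{m} + \left(\frac{m-1}{m}\right) I_{m-2}. \]

Setting m = 7 gives

\[ I_7 = \frac{\cos^{6}\theta \,\sin\theta}{7} + \frac{6}{7} I_5 \]
\[ \quad = \frac{\cos^{6}\theta \,\sin\theta}{7}
     + \frac{6}{7}\left(\frac{\cos^{4}\theta \,\sin\theta}{5} + \frac{4}{5} I_3\right) \]
\[ \quad = \frac{\cos^{6}\theta \,\sin\theta}{7}
     + \frac{6}{7}\left[\frac{\cos^{4}\theta \,\sin\theta}{5}
     + \frac{4}{5}\left(\frac{\cos^{2}\theta \,\sin\theta}{3} + \frac{2}{3} I_1\right)\right]. \]

As $I_1 = \int \cos\theta \, d\theta = \sin\theta + C$ this gives the result

\[ \int \cos^7\theta \, d\theta
   = \frac{1}{7}\cos^6\theta \,\sin\theta + \frac{6}{35}\cos^4\theta \,\sin\theta
   + \frac{8}{35}\cos^2\theta \,\sin\theta + \frac{16}{35}\sin\theta + C. \]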
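Because a reduction formula is a recursion, it translates almost line for line into a short program. The following is a minimal sketch, added for illustration, that applies the formula of Example 8.10 recursively (with the constant of integration taken as zero) and compares the value for m = 7 both with the expanded result above and with a Simpson's rule estimate. The function names and the choice of Simpson's rule are assumptions of the sketch.

```python
import math

def cos_power_antiderivative(m, theta):
    """I_m(theta) from I_m = cos^(m-1) sin / m + ((m-1)/m) I_(m-2), with C = 0."""
    if m == 0:
        return theta
    if m == 1:
        return math.sin(theta)
    boundary = math.cos(theta) ** (m - 1) * math.sin(theta) / m
    return boundary + (m - 1) / m * cos_power_antiderivative(m - 2, theta)

def explicit_I7(theta):
    """Expanded form of I_7 obtained in Example 8.10, with C = 0."""
    c, s = math.cos(theta), math.sin(theta)
    return s * (c**6 / 7 + 6 * c**4 / 35 + 8 * c**2 / 35 + 16 / 35)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

theta = 1.2
recursive = cos_power_antiderivative(7, theta)
assert abs(recursive - explicit_I7(theta)) < 1e-12
# I_7 vanishes at theta = 0, so it equals the definite integral from 0 to theta.
assert abs(recursive - simpson(lambda t: math.cos(t) ** 7, 0.0, theta)) < 1e-9
```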


Example 8.11 Evaluate the definite integral

\[ J_m = \int_0^{\pi/2} \cos^m\theta \, d\theta \]

and deduce its relationship to $\int_0^{\pi/2} \sin^m\theta \, d\theta$.

Solution We can make use of the reduction formula determined in the
previous example. It follows from

\[ I_m = \frac{\cos^{m-1}\theta \,\sin\theta}{m} + \left(\frac{m-1}{m}\right) I_{m-2} \]

that, for $m \ge 2$, the definite integral $J_m$ obeys the reduction formula

\[ J_m = \left[\frac{\cos^{m-1}\theta \,\sin\theta}{m}\right]_0^{\pi/2}
       + \left(\frac{m-1}{m}\right) J_{m-2}
     = \left(\frac{m-1}{m}\right) J_{m-2}. \]

We must now consider separately even and odd values of m. Firstly, if m is
even, so that we may write m = 2n, then

\[ J_{2n} = \frac{2n-1}{2n}\cdot\frac{2n-3}{2n-2}\cdots\frac{1}{2}\; J_0. \]

Secondly, if m is odd, so that we may write m = 2n + 1, then

\[ J_{2n+1} = \frac{2n}{2n+1}\cdot\frac{2n-2}{2n-1}\cdots\frac{2}{3}\; J_1. \]

So, using the fact that $J_0 = \int_0^{\pi/2} 1\,d\theta = \tfrac12\pi$ and
$J_1 = \int_0^{\pi/2} \cos\theta\,d\theta = 1$, we obtain:

\[ J_{2n} = \frac{1\cdot3\cdot5\cdots(2n-1)}{2\cdot4\cdot6\cdots 2n}\cdot\frac{\pi}{2},
   \qquad
   J_{2n+1} = \frac{2\cdot4\cdot6\cdots 2n}{3\cdot5\cdot7\cdots(2n+1)}. \]

Finally let us prove that

\[ J_m = \int_0^{\pi/2} \cos^m x \, dx = \int_0^{\pi/2} \sin^m x \, dx. \]

To achieve this make the variable change $x = \tfrac12\pi - u$ in $J_m$ to obtain

\[ \int_0^{\pi/2} \cos^m x \, dx
   = -\int_{\pi/2}^{0} \cos^m(\tfrac12\pi - u)\,du
   = \int_0^{\pi/2} \cos^m(\tfrac12\pi - u)\,du
   = \int_0^{\pi/2} \sin^m u \, du. \]

This last result is of some interest historically, as it provided the first
infinite product representation for $\pi$. One form of the argument used to
derive this result proceeds as follows.

It is readily seen from the expressions for $J_{2n}$ and $J_{2n+1}$ that

\[ \frac{\pi}{2} = \left[\frac{2\cdot4\cdot6\cdots 2n}{1\cdot3\cdot5\cdots(2n-1)}\right]^2
   \frac{1}{2n+1}\cdot\frac{J_{2n}}{J_{2n+1}}. \tag{8.16} \]

Now in the interval $(0, \tfrac12\pi)$ the following inequalities hold:

\[ \sin^{2n-1} x > \sin^{2n} x > \sin^{2n+1} x > 0, \]

so that as

\[ J_m = \int_0^{\pi/2} \sin^m x \, dx, \]

it follows at once that

\[ J_{2n-1} > J_{2n} > J_{2n+1}. \]

This is equivalent to

\[ \frac{J_{2n-1}}{J_{2n+1}} > \frac{J_{2n}}{J_{2n+1}} > 1, \tag{8.17} \]

but as

\[ \frac{J_{2n-1}}{J_{2n+1}} = \frac{2n+1}{2n}, \]

we must have

\[ \lim_{n\to\infty} \frac{J_{2n-1}}{J_{2n+1}} = 1. \tag{8.18} \]

By virtue of Eqns (8.17) and (8.18) it also follows that

\[ \lim_{n\to\infty} \frac{J_{2n}}{J_{2n+1}} = 1. \]

So, taking the limit of Eqn (8.16) as $n \to \infty$, we arrive at the expression

\[ \frac{\pi}{2} = \lim_{n\to\infty}\left[
   \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdots
   \frac{2n-2}{2n-1}\cdot\frac{2n}{2n-1}\cdot\frac{2n}{2n+1}\right]. \tag{8.19} \]

This famous result, called an infinite product, was first obtained by the 17th-
century mathematician John Wallis. If $S_n$ denotes the nth partial product

\[ S_n = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdots
   \frac{2n-2}{2n-1}\cdot\frac{2n}{2n-1}\cdot\frac{2n}{2n+1}, \]

then the limit in Eqn (8.19) is to be interpreted to mean that
$|\tfrac12\pi - S_n| \to 0$ as $n \to \infty$.
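The convergence of the partial products can be observed numerically. The sketch below is an added illustration: it forms $S_n$ directly, builds $J_m$ from the recurrence $J_m = \frac{m-1}{m} J_{m-2}$, and checks the exact relation (8.16) between them. The function names are assumptions; note that the convergence is slow, with the error falling off only like $1/n$.

```python
import math

def wallis_partial_product(n):
    """S_n = (2/1)(2/3)(4/3)(4/5)...(2n/(2n-1))(2n/(2n+1))."""
    product = 1.0
    for k in range(1, n + 1):
        product *= (2 * k) / (2 * k - 1) * (2 * k) / (2 * k + 1)
    return product

def J(m):
    """Integral of cos^m over [0, pi/2], built up from J_0 = pi/2 or J_1 = 1."""
    value = math.pi / 2 if m % 2 == 0 else 1.0
    for k in range(2 + m % 2, m + 1, 2):
        value *= (k - 1) / k
    return value

# Eqn (8.16): pi/2 = S_n * J_{2n} / J_{2n+1} holds exactly for every n.
for n in (1, 5, 50):
    assert abs(wallis_partial_product(n) * J(2 * n) / J(2 * n + 1) - math.pi / 2) < 1e-12

# The partial products approach pi/2 from below, but only slowly.
assert wallis_partial_product(10) < wallis_partial_product(1000) < math.pi / 2
assert abs(wallis_partial_product(1000) - math.pi / 2) < 1e-3
```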

Reduction formulae may involve more than one parameter, as the final
example illustrates.

Example 8.12 Show that

\[ I_{m,n} = \int \sin^m x \,\cos^n x \, dx \]

satisfies the reduction formula

\[ (m+n) I_{m,n} = -\sin^{m-1} x \,\cos^{n+1} x + (m-1) I_{m-2,n}. \]

Solution Write $I_{m,n}$ in the form shown below and integrate by parts.

\[ I_{m,n} = \int \sin^{m-1} x \,\cos^n x \; d(-\cos x) \]
\[ \quad = -\sin^{m-1} x \,\cos^{n+1} x
     - \int (-\cos x)\left[(m-1)\sin^{m-2} x \,\cos^{n+1} x
     - n \sin^{m} x \,\cos^{n-1} x\right] dx \]
\[ \quad = -\sin^{m-1} x \,\cos^{n+1} x + (m-1) I_{m-2,n+2} - n I_{m,n}. \]

Next reduce $I_{m-2,n+2}$ to a simpler form by writing

\[ I_{m-2,n+2} = \int \sin^{m-2} x \,\cos^{n+2} x \, dx
   = \int \sin^{m-2} x \,\cos^{n} x \,(1 - \sin^2 x)\, dx, \]

which shows that

\[ I_{m-2,n+2} = I_{m-2,n} - I_{m,n}. \]

Using this to eliminate $I_{m-2,n+2}$ from the previous result gives

\[ I_{m,n} = -\sin^{m-1} x \,\cos^{n+1} x + (m-1) I_{m-2,n} - (m-1) I_{m,n} - n I_{m,n} \]

or,

\[ (m+n) I_{m,n} = -\sin^{m-1} x \,\cos^{n+1} x + (m-1) I_{m-2,n}. \]


8.5 Integration of rational functions—partial fractions 


It will be recalled from Chapter 2 that a rational fraction is a quotient 
N(x)/ D(x), in which N(x) and D(x) are polynomials. Antiderivatives of 
rational fractions are often required and in this section we indicate ways of 
expressing the fractions as the sum of simpler expressions, the antiderivatives 
of which are either known or may be found by standard methods. Our 
approach to the general problem of finding the antiderivative

\[ I = \int \frac{N(x)}{D(x)}\,dx \]

will be to first consider some important special cases.


Case (a) Suppose that N(x) is of degree 0 and D(x) is a polynomial of
degree 1 and write

\[ \frac{N(x)}{D(x)} = \frac{1}{cx + d}. \]

Then, making the substitution u = cx + d, we find

\[ \int \frac{dx}{cx + d} = \frac{1}{c}\int \frac{du}{u} = \frac{1}{c}\log|u| + C \]

and so

\[ \int \frac{dx}{cx + d} = \frac{1}{c}\log|cx + d| + C. \]

A similar argument establishes that, for $n \neq 1$,

\[ \int \frac{dx}{(cx + d)^n} = \frac{-1}{c(n-1)}\cdot\frac{1}{(cx + d)^{n-1}} + C. \]

Case (b) Suppose N(x) is of degree 0 and D(x) is of degree 2 and write

\[ \frac{N(x)}{D(x)} = \frac{1}{ax^2 + bx + c}. \]

Then completing the square in the denominator D(x) gives

\[ ax^2 + bx + c = a\left[\left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} - \frac{b^2}{4a^2}\right]
   = a\left[\left(x + \frac{b}{2a}\right)^2 + \alpha\right], \]

where $\alpha = (c/a) - (b^2/4a^2)$ may be positive, negative, or zero. Making the
variable change u = x + (b/2a) then shows that

\[ I = \int \frac{dx}{ax^2 + bx + c} = \frac{1}{a}\int \frac{du}{u^2 + \alpha}. \]

This is a standard integral which may be identified from Table 8.1 once the
sign of $\alpha$ has been determined. It will involve either the function arctan
or the function arctanh.


Case (c) Suppose N(x) is of degree 1 and D(x) is of degree 2 and write

\[ \frac{N(x)}{D(x)} = \frac{px + q}{ax^2 + bx + c}. \]

Then we can write

\[ I = \int \frac{px + q}{ax^2 + bx + c}\,dx
     = \int \frac{(p/2a)(2ax + b) + [q - (pb/2a)]}{ax^2 + bx + c}\,dx, \]

from which we find

\[ I = \frac{p}{2a}\int \frac{2ax + b}{ax^2 + bx + c}\,dx
     + \left(q - \frac{pb}{2a}\right)\int \frac{dx}{ax^2 + bx + c}. \]

The second antiderivative is the one discussed in (b) above, and by setting
$u = ax^2 + bx + c$, the first antiderivative reduces to

\[ \int \frac{2ax + b}{ax^2 + bx + c}\,dx = \int \frac{du}{u}
   = \log|u| + C = \log|ax^2 + bx + c| + C. \]

Combining this result with that of Case (b) then leads to the desired
antiderivative I.
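Cases (b) and (c) can be combined into a single closed form when $\alpha > 0$, which is the arctan branch. The sketch below is an added illustration under that assumption only (the cases $\alpha \le 0$ are deliberately not handled); the function name and the sample coefficients are hypothetical. It checks the formula by comparing a finite-difference derivative with the integrand.

```python
import math

def antiderivative_case_c(x, p, q, a, b, c):
    """Integral of (p x + q)/(a x^2 + b x + c), assuming alpha = c/a - b^2/(4a^2) > 0."""
    alpha = c / a - b * b / (4 * a * a)
    if alpha <= 0:
        raise ValueError("this sketch only covers the arctan case, alpha > 0")
    u = x + b / (2 * a)
    root = math.sqrt(alpha)
    # First antiderivative: (p/2a) log|a x^2 + b x + c|.
    log_part = (p / (2 * a)) * math.log(abs(a * x * x + b * x + c))
    # Second antiderivative, from Case (b): (1/a) * arctan(u/sqrt(alpha)) / sqrt(alpha).
    arctan_part = (q - p * b / (2 * a)) * math.atan(u / root) / (a * root)
    return log_part + arctan_part

p, q, a, b, c = 3.0, 1.0, 2.0, 2.0, 5.0  # hypothetical coefficients, alpha = 2.25
x, h = 0.4, 1e-5
slope = (antiderivative_case_c(x + h, p, q, a, b, c)
         - antiderivative_case_c(x - h, p, q, a, b, c)) / (2 * h)
assert abs(slope - (p * x + q) / (a * x * x + b * x + c)) < 1e-7
```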


Case (d) Suppose N(x) is of degree 1 and D(x) is a quadratic raised to the
power n > 1 and write

\[ \frac{N(x)}{D(x)} = \frac{px + q}{(ax^2 + bx + c)^n}. \]

Then, using the identity

\[ px + q = \left(\frac{p}{2a}\right)(2ax + b) + \left(q - \frac{pb}{2a}\right) \]

enables us to write

\[ \int \frac{px + q}{(ax^2 + bx + c)^n}\,dx
   = \left(\frac{p}{2a}\right)\int \frac{2ax + b}{(ax^2 + bx + c)^n}\,dx
   + \left(q - \frac{pb}{2a}\right)\int \frac{dx}{(ax^2 + bx + c)^n}. \]

Setting $u = ax^2 + bx + c$ in the first antiderivative on the right-hand side
then leads to

\[ \int \frac{2ax + b}{(ax^2 + bx + c)^n}\,dx = \int \frac{du}{u^n}
   = -\left(\frac{1}{n-1}\right)\frac{1}{u^{n-1}} + C
   = -\left(\frac{1}{n-1}\right)\frac{1}{(ax^2 + bx + c)^{n-1}} + C. \]

The second antiderivative on the right-hand side must be evaluated by means
of a reduction formula.
In the case n = 1 we have the obvious result

\[ \int \frac{2ax + b}{ax^2 + bx + c}\,dx = \log|ax^2 + bx + c| + C. \]

Having considered a number of special cases we must now examine how 
we should proceed when D(x) is any polynomial with real coefficients, and 
the degree of the polynomial N(x) is less than that of D(x). The coefficient
$a_0$ of the highest power of x in D(x) will be assumed to be unity, since if
this is not the case it can always be made so by division of N(x) and D(x) by
$a_0$. Now we know from Corollary 4.1(b) that D(x) may be factorized into real
factors of the form

\[ D(x) = (x - a)^k (x - b)^l \cdots (x^2 + px + q)^m, \tag{8.22} \]

where x = a, b, ..., are real roots with multiplicities k, l, ..., and
$(x^2 + px + q)^m$ represents an m-fold repeated pair of complex conjugate roots.
Then from elementary algebraic considerations it may be shown that when
the degree of N(x) is less than that of D(x) we may always set

\[ \frac{N(x)}{D(x)} = \frac{A_1}{(x-a)} + \frac{A_2}{(x-a)^2} + \cdots + \frac{A_k}{(x-a)^k}
   + \frac{B_1}{(x-b)} + \frac{B_2}{(x-b)^2} + \cdots + \frac{B_l}{(x-b)^l} + \cdots \]
\[ \qquad + \frac{P_1 x + Q_1}{(x^2 + px + q)} + \frac{P_2 x + Q_2}{(x^2 + px + q)^2}
   + \cdots + \frac{P_m x + Q_m}{(x^2 + px + q)^m}. \tag{8.23} \]

That is to say, every rational fraction may be expressed as a sum of simple
fractions of the types whose antiderivatives were obtained in Cases (a) to (d).

The expression on the right-hand side of Eqn (8.23) is called a partial
fraction expansion of the rational fraction N(x)/D(x) and the coefficients
$A_1, A_2, \ldots, P_m, Q_m$ are called undetermined coefficients. The undetermined
coefficients may be found by cross-multiplication of this expression, followed
by equating the coefficients of equal powers of x. Antiderivatives of rational
fractions N(x)/D(x) may thus be found by a combination of the method of
partial fractions and the results of Cases (a) to (d).

If the degree of N(x) exceeds that of D(x) by n, then the situation may be 
reduced to the one just described by simply adding to the partial fraction 
expansion (8-23) the extra terms 


\[ R_0 + R_1 x + R_2 x^2 + \cdots + R_n x^n. \]


This result can also be achieved by first dividing N(x) by D(x). The circum- 
stances usually dictate which approach is the easier. 


Example 8.13 Evaluate

\[ I = \int \frac{x^3 + 5x^2 + 9x + 5}{x^2 + 3x + 1}\,dx. \]

Solution Here, as the degree of N(x) only exceeds that of D(x) by one, we
shall start by dividing the integrand to get

\[ \frac{x^3 + 5x^2 + 9x + 5}{x^2 + 3x + 1} = x + 2 + \frac{2x + 3}{x^2 + 3x + 1}, \]

whence

\[ I = \int (x + 2)\,dx + \int \frac{2x + 3}{x^2 + 3x + 1}\,dx. \]

The first antiderivative is trivial, whilst the second is of the form discussed in
Case (d) with n = 1, so that

\[ I = \frac{x^2}{2} + 2x + \log|x^2 + 3x + 1| + C. \]


Example 8.14 Evaluate

\[ I = \int \frac{x\,dx}{(x + 2)^2 (x - 1)}. \]

Solution In this case we must adopt the partial fraction expansion

\[ \frac{x}{(x + 2)^2 (x - 1)} = \frac{A}{x + 2} + \frac{B}{(x + 2)^2} + \frac{C}{x - 1}. \]

Cross-multiplication gives

\[ x = A(x + 2)(x - 1) + B(x - 1) + C(x + 2)^2 \]

or

\[ x = A(x^2 + x - 2) + B(x - 1) + C(x^2 + 4x + 4). \]

Equating coefficients of equal powers of x gives:

Coefficient of $x^2$:  0 = A + C
Coefficient of $x$:    1 = A + B + 4C
Coefficient of $x^0$:  0 = -2A - B + 4C,

showing that A = -1/9, B = 2/3, and C = 1/9. We may thus write

\[ \int \frac{x\,dx}{(x + 2)^2 (x - 1)}
   = -\frac{1}{9}\int \frac{dx}{x + 2} + \frac{2}{3}\int \frac{dx}{(x + 2)^2}
     + \frac{1}{9}\int \frac{dx}{x - 1}. \]

These antiderivatives were all discussed in Case (a), so that using those results
we obtain

\[ I = -\frac{1}{9}\log|x + 2| - \frac{2}{3(x + 2)} + \frac{1}{9}\log|x - 1| + C. \]
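Equating coefficients always produces a system of linear equations for the undetermined coefficients, so the step can be mechanized. The sketch below is an added illustration: it solves the three equations of Example 8.14 exactly, using rational arithmetic and Gauss-Jordan elimination. The helper name `solve_linear_system` is hypothetical, and the routine assumes that the system has a unique solution.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination over exact rationals; assumes a unique solution."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Unknowns ordered (A, B, C); one equation per power of x in Example 8.14.
coefficients = [
    [1, 0, 1],    # x^2:  0 = A + C
    [1, 1, 4],    # x^1:  1 = A + B + 4C
    [-2, -1, 4],  # x^0:  0 = -2A - B + 4C
]
A, B, C = solve_linear_system(coefficients, [0, 1, 0])
assert (A, B, C) == (Fraction(-1, 9), Fraction(2, 3), Fraction(1, 9))
```

Exact fractions are used rather than floating-point numbers so that the coefficients come out in the same form as in the hand calculation.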


Example 8.15 Find the antiderivative

\[ I = \int \frac{x^4 - x^3 + 5x^2 + x + 3}{(x + 1)(x^2 - x + 1)^2}\,dx. \]

Solution Here $N(x) = x^4 - x^3 + 5x^2 + x + 3$ and
$D(x) = (x + 1)(x^2 - x + 1)^2$, so that the degree of N(x) is 4 and the degree
of D(x) is 5. Following on from our earlier reasoning we must set

\[ \frac{x^4 - x^3 + 5x^2 + x + 3}{(x + 1)(x^2 - x + 1)^2}
   = \frac{A}{x + 1} + \frac{Bx + C}{x^2 - x + 1} + \frac{Dx + E}{(x^2 - x + 1)^2}. \]

Cross-multiplication gives the identity

\[ x^4 - x^3 + 5x^2 + x + 3 = A(x^2 - x + 1)^2
   + (Bx + C)(x + 1)(x^2 - x + 1) + (Dx + E)(x + 1). \]

Instead of expanding the right-hand side and then equating coefficients of
equal powers of x as in the previous example, we shall use the fact that
(x + 1) is a factor of D(x) to simplify this expression. Setting x = -1 we
find that 9 = 9A, or A = 1 and so, since $(x + 1)(x^2 - x + 1) = x^3 + 1$,

\[ x^4 - x^3 + 5x^2 + x + 3 = (x^2 - x + 1)^2 + (Bx + C)(x^3 + 1) + (Dx + E)(x + 1), \]

whence

\[ x^3 + 2x^2 + 3x + 2 = (Bx + C)(x^3 + 1) + (Dx + E)(x + 1). \]

Having eliminated A we now proceed as before and equate coefficients of
equal powers of x to find B, C, D, and E:

Coefficient of $x^4$:  0 = B
Coefficient of $x^3$:  1 = C
Coefficient of $x^2$:  2 = D
Coefficient of $x$:    3 = B + E + D
Coefficient of $x^0$:  2 = C + E.

Thus, B = 0, C = 1, D = 2, E = 1 and so

\[ I = \int \frac{dx}{x + 1} + \int \frac{dx}{x^2 - x + 1}
     + \int \frac{2x + 1}{(x^2 - x + 1)^2}\,dx = I_1 + I_2 + I_3. \]

Now

\[ I_1 = \int \frac{dx}{x + 1} = \log|x + 1| + C_1 \]

and

\[ I_2 = \int \frac{dx}{(x - \tfrac12)^2 + (\sqrt3/2)^2}
       = \frac{2}{\sqrt3}\arctan\left(\frac{2x - 1}{\sqrt3}\right) + C_2. \]

To evaluate $I_3$ write

\[ I_3 = \int \frac{2x - 1}{(x^2 - x + 1)^2}\,dx + \int \frac{2\,dx}{(x^2 - x + 1)^2}
       = \frac{-1}{(x^2 - x + 1)} + \int \frac{2\,dx}{[(x - \tfrac12)^2 + (\sqrt3/2)^2]^2}. \]

Next, setting $x - \tfrac12 = (\sqrt3/2)\tan\theta$, so that
$dx = (\sqrt3/2)\sec^2\theta\,d\theta$, gives

\[ \int \frac{2\,dx}{[(x - \tfrac12)^2 + (\sqrt3/2)^2]^2}
   = \frac{16\sqrt3}{9}\int \cos^2\theta \, d\theta. \]

Using the identity $\cos^2\theta = \tfrac12(1 + \cos 2\theta)$ this may be evaluated to give

\[ \int \frac{2\,dx}{[(x - \tfrac12)^2 + (\sqrt3/2)^2]^2}
   = \frac{8\sqrt3}{9}\left[\theta + \tfrac12 \sin 2\theta\right] + C_3
   = \frac{8\sqrt3}{9}\arctan\left(\frac{2x - 1}{\sqrt3}\right)
     + \frac{2}{3}\cdot\frac{2x - 1}{x^2 - x + 1} + C_3. \]

Hence we have shown that

\[ I_3 = \frac{-1}{(x^2 - x + 1)} + \frac{8\sqrt3}{9}\arctan\left(\frac{2x - 1}{\sqrt3}\right)
       + \frac{2}{3}\cdot\frac{2x - 1}{x^2 - x + 1} + C_3. \]

Adding $I_1$, $I_2$, and $I_3$ to find I finally gives

\[ I = \log|x + 1| + \frac{14\sqrt3}{9}\arctan\left(\frac{2x - 1}{\sqrt3}\right)
     + \frac{4x - 5}{3(x^2 - x + 1)} + C. \]

A factor $(x^2 - x + 1)^3$ in the denominator would have led to
$\int \cos^4\theta\,d\theta$ and so, in general, we would obtain antiderivatives
of the form $\int \cos^{2n}\theta\,d\theta$.
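With so many steps, a long partial-fraction calculation of this kind is worth checking independently. The sketch below is an added numerical check and not part of the original solution: it differentiates the final antiderivative of Example 8.15 by a central difference and compares the result with the integrand at several points. The function names and sample points are assumptions.

```python
import math

def integrand(x):
    q = x * x - x + 1
    return (x**4 - x**3 + 5 * x**2 + x + 3) / ((x + 1) * q * q)

def antiderivative(x):
    """Final result of Example 8.15 with the arbitrary constant taken as zero."""
    q = x * x - x + 1
    return (math.log(abs(x + 1))
            + 14 * math.sqrt(3) / 9 * math.atan((2 * x - 1) / math.sqrt(3))
            + (4 * x - 5) / (3 * q))

def central_difference(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

# Sample points avoid x = -1, where the integrand is singular.
for x in (-3.0, 0.0, 0.5, 2.0):
    assert abs(central_difference(antiderivative, x) - integrand(x)) < 1e-6
```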


8.6 Other special techniques of integration 


A great variety of different methods exist for evaluating particular types of 
antiderivative, and in this final section we illustrate only a few specially 
useful ones with the help of some examples. Extensive tables of integrals are 
readily available and, where possible, should be used to minimise tedious 
manipulation. 


8.6(a) Substitution t = tan x/2

If we write t = tan x/2 it is easily proved by means of trigonometric identities
that

\[ \sin x = \frac{2t}{1 + t^2} \quad\text{and}\quad \cos x = \frac{1 - t^2}{1 + t^2}. \tag{8.24} \]

Using these results we can also establish the differential relation

\[ dx = \frac{2\,dt}{1 + t^2}. \tag{8.25} \]

Consequently, in principle, any rational fraction R(sin x, cos x) that involves
only the sine and cosine functions may be transformed by means of (8.24)
into a rational fraction involving t. On account of this result and (8.25),
it then follows that

\[ I = \int R(\sin x, \cos x)\,dx
     = \int R\left[\frac{2t}{1 + t^2}, \frac{1 - t^2}{1 + t^2}\right]\frac{2\,dt}{1 + t^2}. \]

Thus I has been transformed into an antiderivative of a rational function
involving t.
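The three relations (8.24) and (8.25) on which the substitution rests can be confirmed numerically before they are used. The sketch below is an added illustration with hypothetical function names: it compares the rational expressions in t with sin x and cos x, and checks the differential relation by differentiating x = 2 arctan t.

```python
import math

def half_angle_forms(x):
    """Return (2t/(1+t^2), (1-t^2)/(1+t^2)) for t = tan(x/2)."""
    t = math.tan(x / 2)
    return 2 * t / (1 + t * t), (1 - t * t) / (1 + t * t)

def dx_dt(t, h=1e-6):
    """Finite-difference derivative of x = 2 arctan t with respect to t."""
    return (2 * math.atan(t + h) - 2 * math.atan(t - h)) / (2 * h)

# Values of x are kept inside (-pi, pi), where t = tan(x/2) is finite.
for x in (-2.5, -0.3, 0.0, 1.1, 3.0):
    s, c = half_angle_forms(x)
    assert abs(s - math.sin(x)) < 1e-12
    assert abs(c - math.cos(x)) < 1e-12

for t in (-2.0, 0.0, 0.7):
    assert abs(dx_dt(t) - 2 / (1 + t * t)) < 1e-8
```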


Example 8.16 Evaluate

\[ I = \int \frac{\cos x \, dx}{1 + \sin x}. \]

Solution Transforming to the variable t as indicated above gives

\[ I = \int \frac{2(1 - t)}{(1 + t)(1 + t^2)}\,dt. \]

It is readily established that

\[ \frac{2(1 - t)}{(1 + t)(1 + t^2)} = \frac{2}{1 + t} - \frac{2t}{1 + t^2}, \]

showing that

\[ I = \int \left(\frac{2}{1 + t} - \frac{2t}{1 + t^2}\right) dt
     = \log (1 + t)^2 - \log (1 + t^2) + C. \]

Thus

\[ I = \log\left[\frac{(1 + t)^2}{1 + t^2}\right] + C, \]

whence from (8.24),

\[ I = \log (1 + \sin x) + C. \]



8.6(b) Integration of $R[x, \sqrt{(ax^2 + bx + c)}]$


We define $R[x, \sqrt{(ax^2 + bx + c)}]$ to be a rational fraction involving x and
$\sqrt{(ax^2 + bx + c)}$. Special cases of this general type in which b = 0 have
been encountered in Examples 8.2 and 8.5 where it was shown that the
substitutions x = sin u and x = sinh u can be used to reduce the integrand to
one involving only trigonometric or hyperbolic functions. If it is of
trigonometric type then the technique of (a) above may be used to reduce the
integrand further to a rational function. If the integrand is of hyperbolic
type then the substitution

\[ t = \tanh x/2, \]

together with

\[ \sinh x = \frac{2t}{1 - t^2} \quad\text{and}\quad \cosh x = \frac{1 + t^2}{1 - t^2} \tag{8.26} \]

and the differential relation

\[ dx = \frac{2\,dt}{1 - t^2} \tag{8.27} \]

will again reduce the integrand to a rational function.

If $b \neq 0$, then completing the square under the square root sign gives

\[ \sqrt{(ax^2 + bx + c)}
   = \sqrt{\left\{a\left[\left(x + \frac{b}{2a}\right)^2
     + \left(\frac{c}{a} - \frac{b^2}{4a^2}\right)\right]\right\}}. \]

The substitution u = x + (b/2a) will then reduce the problem to one of the
two special cases just discussed, according to the signs of a and
$[(c/a) - (b^2/4a^2)]$.

Example 8.17 Evaluate

\[ I = \int \frac{dx}{\sqrt{(2 - 3x - 4x^2)}}. \]

Solution First we complete the square under the square root sign to obtain

\[ I = \int \frac{dx}{\sqrt{\{4[41/64 - (x + 3/8)^2]\}}}. \]

Then, setting $u = x + \tfrac38$ this becomes

\[ I = \frac12 \int \frac{du}{\sqrt{(41/64 - u^2)}}
     = \frac12 \arcsin \frac{8u}{\sqrt{41}} + C \]

and thus

\[ I = \frac12 \arcsin\left(\frac{8x + 3}{\sqrt{41}}\right) + C. \]


8.6(c) Integration by means of differentiation under the integral sign

This approach utilizes the idea of differentiation under the integral sign with
respect to a parameter. It relies on finding a known antiderivative involving a
parameter $\alpha$, say, with the property that the derivative of its integrand with
respect to this parameter $\alpha$ is capable of being simply related to the integrand
of the desired antiderivative. Specifically, the method uses the result that if

\[ F(x, \alpha) = \int f(x, \alpha)\,dx \]

then,

\[ \frac{\partial F}{\partial \alpha} = \int \frac{\partial f}{\partial \alpha}\,dx. \]

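Example 8.18 Evaluate by means of differentiation under the integral sign
the antiderivative

\[ I = \int \frac{dx}{(x^2 + a^2)^{3/2}}. \]

Solution We first note that the integrand $1/(x^2 + a^2)^{3/2}$ is simply related to
the derivative

\[ \frac{\partial}{\partial a}\left[\frac{1}{(x^2 + a^2)^{1/2}}\right]
   = -\frac{a}{(x^2 + a^2)^{3/2}}. \]

Accordingly, let us consider the familiar antiderivative

\[ \int \frac{dx}{(x^2 + a^2)^{1/2}} = \operatorname{arcsinh}\frac{x}{a} + C. \]

Then

\[ \frac{\partial}{\partial a}\int \frac{dx}{(x^2 + a^2)^{1/2}}
   = \frac{\partial}{\partial a}\left[\operatorname{arcsinh}\frac{x}{a} + C\right] \]

and so, taking a > 0,

\[ -a\int \frac{dx}{(x^2 + a^2)^{3/2}} = -\left(\frac{x}{a}\right)\frac{1}{(x^2 + a^2)^{1/2}}, \]

whence

\[ \int \frac{dx}{(x^2 + a^2)^{3/2}} = \frac{x}{a^2 (x^2 + a^2)^{1/2}} + C'. \]

The arbitrary constant C' has been added since we are deducing an
antiderivative and not just an indefinite integral.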
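Both steps of this argument can be tested numerically: the derivative of the known antiderivative with respect to the parameter, and the derivative of the final result with respect to x. The sketch below is an added check, assuming a > 0; the function names and the sample values are hypothetical.

```python
import math

def F(x, a):
    """Known antiderivative of 1/sqrt(x^2 + a^2), constant omitted (a > 0)."""
    return math.asinh(x / a)

def candidate(x, a):
    """Result of Example 8.18, x / (a^2 sqrt(x^2 + a^2)), constant omitted."""
    return x / (a * a * math.sqrt(x * x + a * a))

x, a, h = 1.5, 2.0, 1e-5

# Differentiating F with respect to the parameter a should give -a times the result.
dF_da = (F(x, a + h) - F(x, a - h)) / (2 * h)
assert abs(dF_da - (-a * candidate(x, a))) < 1e-8

# Differentiating the result with respect to x should recover the integrand.
d_dx = (candidate(x + h, a) - candidate(x - h, a)) / (2 * h)
assert abs(d_dx - (x * x + a * a) ** -1.5) < 1e-8
```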


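8.6(d) Integration of trigonometric functions involving multiple angles


Antiderivatives of products of trigonometric functions involving multiple
angles are of considerable importance and the most frequently occurring
ones are:

\[ I_1 = \int \sin mx \,\cos nx \, dx, \tag{8.28} \]
\[ I_2 = \int \sin mx \,\sin nx \, dx, \tag{8.29} \]
\[ I_3 = \int \cos mx \,\cos nx \, dx. \tag{8.30} \]

These are easily evaluated by appeal to the trigonometric identities:

\[ \sin mx \,\cos nx = \tfrac12[\sin (m + n)x + \sin (m - n)x], \tag{8.31} \]
\[ \sin mx \,\sin nx = \tfrac12[\cos (m - n)x - \cos (m + n)x], \tag{8.32} \]
\[ \cos mx \,\cos nx = \tfrac12[\cos (m + n)x + \cos (m - n)x]. \tag{8.33} \]

Substitution of these identities into the above antiderivatives produces:

\[ I_1 = \begin{cases}
   -\dfrac12\left[\dfrac{\cos (m - n)x}{m - n} + \dfrac{\cos (m + n)x}{m + n}\right] + C
      & \text{for } m^2 \neq n^2, \\[2ex]
   -\dfrac{1}{4m}\cos 2mx + C & \text{for } m = n,
   \end{cases} \tag{8.34} \]

\[ I_2 = \begin{cases}
   \dfrac12\left[\dfrac{\sin (m - n)x}{m - n} - \dfrac{\sin (m + n)x}{m + n}\right] + C
      & \text{for } m^2 \neq n^2, \\[2ex]
   \dfrac{1}{2m}(mx - \sin mx \,\cos mx) + C & \text{for } m = n,
   \end{cases} \tag{8.35} \]

\[ I_3 = \begin{cases}
   \dfrac12\left[\dfrac{\sin (m - n)x}{m - n} + \dfrac{\sin (m + n)x}{m + n}\right] + C
      & \text{for } m^2 \neq n^2, \\[2ex]
   \dfrac{1}{2m}(mx + \sin mx \,\cos mx) + C & \text{for } m = n.
   \end{cases} \tag{8.36} \]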

Example 8.19 Evaluate the following two antiderivatives:

\[ I_1 = \int \sin 3x \,\cos 5x \, dx, \qquad I_2 = \int \sin^2 3x \, dx. \]

Solution The antiderivatives follow immediately by substitution in (8.34)
and (8.35):

\[ I_1 = \frac{\cos 2x}{4} - \frac{\cos 8x}{16} + C, \qquad
   I_2 = \frac{x}{2} - \frac{\sin 3x \,\cos 3x}{6} + C. \]
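Formulae (8.34) and (8.35) are simple enough to code directly, which gives a convenient way of checking substitutions such as those made in Example 8.19. The sketch below is an added illustration; the function names are hypothetical, constants of integration are omitted, m is assumed non-zero, and the case m = -n is deliberately left out.

```python
import math

def sin_cos_antiderivative(m, n, x):
    """I_1 = integral of sin(mx) cos(nx) dx from (8.34), constant omitted."""
    if m == n:
        return -math.cos(2 * m * x) / (4 * m)
    if m * m == n * n:
        raise ValueError("the case m = -n is not covered by this sketch")
    return -0.5 * (math.cos((m - n) * x) / (m - n)
                   + math.cos((m + n) * x) / (m + n))

def sin_squared_antiderivative(m, x):
    """I_2 with m = n from (8.35): (mx - sin mx cos mx) / (2m), constant omitted."""
    return (m * x - math.sin(m * x) * math.cos(m * x)) / (2 * m)

x = 0.37
# Both values should agree with the closed forms quoted in Example 8.19.
assert abs(sin_cos_antiderivative(3, 5, x)
           - (math.cos(2 * x) / 4 - math.cos(8 * x) / 16)) < 1e-12
assert abs(sin_squared_antiderivative(3, x)
           - (x / 2 - math.sin(3 * x) * math.cos(3 * x) / 6)) < 1e-12
```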
PROBLEMS 
Section 8:1 


8:1 Find the following antiderivatives: 
3 dx ᾿ dx 
(a) | 4x2 — 16° (b) [sin 3x dx; (c) Ϊ Oye? 


dx 
(d) ize (e) |! cos 4xdx; (f) [3 dx. 


8.2 Verify by means of differentiation that

\[ \int \frac{dx}{\sqrt{(x^2 - a^2)}} = \log\left|x + \sqrt{(x^2 - a^2)}\right| + C. \]


Compare this form of result with that shown against entry 10 of Table 8.1. 


8.3 Verify by means of differentiation that

\[ \int \frac{dx}{a^2 - b^2x^2} = \frac{1}{2ab}\log\left|\frac{a + bx}{a - bx}\right| + C. \]


Compare this more general result with those shown against entries 11 and 12 


of Table 8.1. 
8.4 Verify by means of differentiation that

\[ \int \frac{dx}{\sqrt{(a^2 + x^2)}} = \log\left|x + \sqrt{(a^2 + x^2)}\right| + C. \]


Compare this form of result with that shown against entry 9 of Table 8-1. 
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8.5 Use the result of Theorem 8-1 to verify the following general results: 


(a) [ Fea leis + cs 


(0) ff) gdx = ρα) — gfe ἐφ γζτσθ τ... 
+ (—1)"g™ f+ (—1)"*1, Sem) fdx; 


ω | (of )ansre 
(d) | (Ξ:. dx = log 


8:6 Apply the results of Problem 8-5 together with some slight manipulation to 
determine the following antiderivatives: 


2x sin x — x2 cos x 
(a) {ί sin? x | on 


6x? + 8x + 2 
(b) Ἰ{π τ 8 


(c) [χϑοῖ dx; 
(a) 7 ΓΞ sinh x — 3 cosh Ἢ dx. 


x cosh x 


δες 


8.7 Evaluate the following antiderivatives: 
(a) [(χ + 3 sin x + 1)dx; (Ὁ) f (4% + 2 cos 2x)dx; 
(c) [ (4 sinh x + sin x)dx; (d) [| (e% + 3)dx. 
8.8 Use the following identities to evaluate the four antiderivatives listed below:

\[ \sinh mx \,\cosh nx = \tfrac12[\sinh (m + n)x + \sinh (m - n)x] \]
\[ \sinh mx \,\sinh nx = \tfrac12[\cosh (m + n)x - \cosh (m - n)x] \]
\[ \cosh mx \,\cosh nx = \tfrac12[\cosh (m + n)x + \cosh (m - n)x] \]

(a) $\int \sinh 4x \,\cosh 2x \, dx$;  (b) $\int \sinh x \,\sinh 3x \, dx$;
(c) $\int \cosh 4x \,\cosh 2x \, dx$;  (d) $\int \cosh^2 2x \, dx$.
Section 8:2 


Use the indicated substitutions to evaluate the following antiderivatives. 


dx 
8-9 ise x= 1/u. 
8.10 $\int \sqrt{(1 - x^2)}\,dx$,  $x = \sin u$.


tanh x dx 
« i PSE gg — 2 
8:11 ie τ hy cosh x = 1 + x2. 


8.12 $\int \cos x \sqrt{\sin x}\,dx$,  $\sin x = u$.
8.13 $\int x(3x^2 + 1)^5\,dx$,  $3x^2 + 1 = u$.
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x dx 
8:14 Ι τὶ V(x 1) =u. 


Evaluate the following antiderivatives by means of a suitable trigonometric 
substitution. 


x? dx 
«1 en ree Ay 
μὰ |v — x?) 


8:16 { YT ax. 


δι... 
8:17 | vr ax. 


Evaluate the following definite integrals. 


1 
8:18 [ (3x + 1) sinh (x? + x + 3)dx. 
0 


1 
8:19 | x56/(1 + x?)dx. 
0 


6 
8:20 { να — 2)dx. 
2 
2 4x + 6 
8:21 { (- rg Pa i} dx. 


Section 8-3 
Evaluate the following antiderivatives using the technique of integration by parts. 
8:22 [665 sin x dx. 


8:23 § xe% dx. 
8:24 [ ταῖς 
sin? x 
8:25 {sin x sinh x dx. \ 


8:26 { 7% cos x dx. 
8:27 § log? x dx. 


8-28 { x arcsin x dx. 


Section 8-4 
8.29 Given that $I_n = \int (1 - x^3)^n \, dx$, where n is an integer, show that

\[ (3n + 1) I_n = x(1 - x^3)^n + 3n I_{n-1}. \]

Hence prove that

\[ \int_0^1 (1 - x^3)^5 \, dx = 3^6/(2^4 \cdot 7 \cdot 13). \]
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8:30 The integral Im is defined by 


oO xem-l . 
| ΞΞ GE pers d* for integral m > 0. 
0 
Show that 
2 
Im-1 = a In, 
m—l 


and by using the substitution x = tan @ prove that 


1 


ἐπ 
in? 6cos® 0 40 = —- 
Ι sin’ θ cos® 6d 150 


8.31 Show that for integral n > 1,

\[ \int_0^{\pi/2} x^n \sin x \, dx = n \int_0^{\pi/2} x^{n-1} \cos x \, dx \]

and

\[ \int_0^{\pi/2} x^n \cos x \, dx = (\tfrac12\pi)^n - n \int_0^{\pi/2} x^{n-1} \sin x \, dx. \]


Use the result to evaluate 


ἀπ 
Ϊ x? cos x dx. 
0 


8.32 The function $I_{p,q}$ is defined by

\[ I_{p,q} = \int x^p (\log x)^q \, dx \]

in which p, q are positive integers. Show that

\[ (p + 1) I_{p,q} + q I_{p,q-1} = x^{p+1} (\log x)^q. \]
8.33 If

\[ I_n = \int \tan^n \theta \, d\theta, \]

where $n \neq 1$ is a positive integer, show that

\[ I_n = \frac{\tan^{n-1}\theta}{n - 1} - I_{n-2}. \]

Use this result to evaluate 


hair 
| tan® 6 ἀθ. 
0 


8.34 The function $I_{m,n}$ is defined by

\[ I_{m,n} = \int x^m (a + bx)^n \, dx, \]

in which m, n are positive integers. Prove that

\[ b(m + n + 1) I_{m,n} + ma\, I_{m-1,n} = x^m (a + bx)^{n+1}. \]
8.35 The function $I_{m,n}$ is defined by

\[ I_{m,n} = \int \sin^m \theta \,\cos^n \theta \, d\theta, \]

in which m, n are positive integers. Show that $I_{m,n}$ satisfies the reduction
formula

\[ (m + n) I_{m,n} - (n - 1) I_{m,n-2} = \sin^{m+1}\theta \,\cos^{n-1}\theta. \]


Section 8.5 
Evaluate the following antiderivatives by means of partial fractions.
8:36 = 


(x — 1x + Dix +3) 


Ἅ᾽ς-- 
8.37 (aaa dx. 


x2? — 5x +6 
838 | oer: 
ia aa = ia : 
84 [τ 


dx 
“41 SS ————_—_—_—_——_——————— | 
es perme reer 


4 _ 3 2 
4.4) | (OX Set 4. ἐκ 
2x2 — x + ἢ 


x? 4+ 2 
(x + 1)3ἃ -- 5) 4 
gag [xi 4.3 Ὁ Mx? Ὁ 12x48 
(x2 + 2x + 3)7%(x + 1) 


8°43 


dx. 


Section 8-6 
Evaluate the following antiderivatives by means of the substitution ¢ = tan x/2. 


dx 
se 1: τ π 
4“ {ππττοσπ 
sin X + cos x 
dx 
5817 ἀρνρῤῥόμα 


sin x 
os oor: - coe 


Evaluate the following antiderivatives by means of one or more suitable sub- 


stitutions. 


dx 


8.49. | UO + 3x -- 2x) 
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3x -- 6 aig 
VJ (x? — 4x + 5) 


dx 
est [os 


8-52 [ν (χϑ- 2x + 5) dx. 
dx 

8.53 | -------Ὁ---ς-ςς͵ς. 

Ϊ Ge -- νὰ -- 2) 


8.54 [ν΄ (ὰ -- x?) dx. 


Use the technique of differentiation under the integral sign to evaluate the 
following antiderivatives. 


8.55 | xe®* dx. 


8:50 


8:56 | ea (Hint: Start from | πε ae in Table 8-1.) 


8:57 | Ga aie (Hint: Start from | sa it Table 8-1.) 


8.58 [χα dx. (Hint: Start from J αὖ dx in Table 8-1.) 
Evaluate the following trigonometric antiderivatives.
8.59 $\int \cos x \,\cos 2x \, dx$.

8.60 $\int \sin ax \,\sin (ax + \varepsilon)\,dx$,  a, $\varepsilon$ non-zero constants.
8.61 $\int \cos x \,\cos^2 3x \, dx$.

8.62 $\int \sin x \,\sin 2x \,\sin 3x \, dx$.


Use the results of this chapter together with Definitions 7-4 and 7-5 of Chapter 7 
to classify the following improper integrals as convergent or divergent. Determine 
the value of all improper integrals that are convergent stating any conditions that 
must be imposed to ensure this. 


1 
8:63 = 
o * 
8-64 dx : 
‘ IlI+x 
= dx 
8-65 -----------.--. 
υ (+ χ)νχ 


8-66 { cos x dx. 
0 


1 dx 
867 | 1 Ξ 


οο 
8-68 [ et dx. 
0 


Linear transformations 
and matrices 


9.1 Introductory ideas


This chapter is concerned with the branch of mathematics known as linear
algebra. One aspect of this subject has already been encountered, namely 
vectors, and it is now necessary to develop in a more general context various 
of the ideas that were first introduced there. Central to the entire subject is 
the fundamental idea that the algebraic operations of addition, subtraction, 
and multiplication can be made meaningful when applied to an array of 
numbers or functions considered as a single entity. 

An example will help here to indicate one of the many different ways in
which such an array may arise, and at the same time to show something of
the type of algebra it is reasonable to want to perform on an array. Three
chemical plants numbered 1 to 3 each have separate sources of raw material
from which each one produces the same four products numbered 1 to 4. Let
plant number m produce product number n at a cost $a_{mn}$ units per ton, then
the production costs of the complex of chemical plants is conveniently
characterized by the following table of the twelve quantities $a_{mn}$.


Table 9.1

                       Product
                1          2          3          4
Plant   1    $a_{11}$   $a_{12}$   $a_{13}$   $a_{14}$
        2    $a_{21}$   $a_{22}$   $a_{23}$   $a_{24}$
        3    $a_{31}$   $a_{32}$   $a_{33}$   $a_{34}$


In writing this table or array of quantities $a_{mn}$ we have used the convention
that the first of the two suffixes attached to the quantity $a_{mn}$ refers to the row
number in which $a_{mn}$ appears, and the second to the column number. Thus
the entry $a_{23}$ occurs in row 2, column 3, whilst the entry $a_{32}$ occurs in row 3,
column 2. The important use of suffixes in this way is strictly analogous to a
map reference in which the first entry is a latitude and the second a longitude.
Thus the double suffix notation used here serves to identify the position in
the array to which the associated quantity is assigned.

On account of the use to which the suffixes have been put, we can now
dispense with the extreme left-hand column and the top row of Table 9.1,
which only serve for identification purposes, and write instead

\[ A = \begin{bmatrix}
   a_{11} & a_{12} & a_{13} & a_{14} \\
   a_{21} & a_{22} & a_{23} & a_{24} \\
   a_{31} & a_{32} & a_{33} & a_{34}
   \end{bmatrix}, \tag{9.1} \]

with the understanding that the symbol A represents the array of quantities
originally contained in Table 9.1.
Returning now to the physical situation from which the array (9.1) was
derived, let us suppose that at some time the quality of the raw materials
changes, so that a revised Table 9.1 then applies in which entry $a_{mn}$ is replaced
by the new entry $b_{mn}$. Then, in terms of our concise notation, we can
characterize this new situation by defining an array B as follows:

\[ B = \begin{bmatrix}
   b_{11} & b_{12} & b_{13} & b_{14} \\
   b_{21} & b_{22} & b_{23} & b_{24} \\
   b_{31} & b_{32} & b_{33} & b_{34}
   \end{bmatrix}. \tag{9.2} \]

In terms of the information at our disposal, we know that the change in
the cost of product n from chemical plant m is $a_{mn} - b_{mn}$, whilst the average
cost of product n from plant m is $\tfrac12(a_{mn} + b_{mn})$. Hence, if C is the array of
change in costs of products and D is the array of the average costs of
products, in our new notation we may write:

\[ C = \begin{bmatrix}
   a_{11} - b_{11} & a_{12} - b_{12} & a_{13} - b_{13} & a_{14} - b_{14} \\
   a_{21} - b_{21} & a_{22} - b_{22} & a_{23} - b_{23} & a_{24} - b_{24} \\
   a_{31} - b_{31} & a_{32} - b_{32} & a_{33} - b_{33} & a_{34} - b_{34}
   \end{bmatrix}, \tag{9.3} \]

\[ D = \begin{bmatrix}
   \tfrac12(a_{11} + b_{11}) & \tfrac12(a_{12} + b_{12}) & \tfrac12(a_{13} + b_{13}) & \tfrac12(a_{14} + b_{14}) \\
   \tfrac12(a_{21} + b_{21}) & \tfrac12(a_{22} + b_{22}) & \tfrac12(a_{23} + b_{23}) & \tfrac12(a_{24} + b_{24}) \\
   \tfrac12(a_{31} + b_{31}) & \tfrac12(a_{32} + b_{32}) & \tfrac12(a_{33} + b_{33}) & \tfrac12(a_{34} + b_{34})
   \end{bmatrix}. \tag{9.4} \]

The form of these results is suggestive, for it would seem that by defining
subtraction of two similar arrays to mean the array formed by the subtraction
of corresponding elements, we may write

\[ C = A - B. \tag{9.5} \]

Similarly, if addition of two similar arrays is taken to mean the array
formed by the addition of corresponding entries, and the multiplication of an
array by a factor is taken to mean the array formed by the multiplication of
each entry by that factor, we may write

\[ D = \tfrac12(A + B). \tag{9.6} \]
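The entry-by-entry definitions in (9.5) and (9.6) are easily expressed in code. The sketch below is an added illustration in which an array is stored as a list of rows; the function names and all of the cost figures are invented purely for the demonstration and do not come from the text.

```python
def subtract(A, B):
    """Array formed by subtracting corresponding entries, as in Eqn (9.5)."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def add(A, B):
    """Array formed by adding corresponding entries."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(factor, A):
    """Array formed by multiplying every entry by the same factor."""
    return [[factor * a for a in row] for row in A]

# Hypothetical costs per ton for 3 plants (rows) and 4 products (columns).
A = [[10, 12, 8, 15], [11, 13, 9, 14], [9, 12, 10, 16]]   # original costs
B = [[12, 12, 9, 14], [10, 15, 9, 16], [11, 10, 10, 18]]  # revised costs

C = subtract(A, B)        # change in costs, C = A - B
D = scale(0.5, add(A, B)) # average costs, D = (1/2)(A + B)

assert C == [[-2, 0, -1, 1], [1, -2, 0, -2], [-2, 2, 0, -2]]
assert D == [[11.0, 12.0, 8.5, 14.5], [10.5, 14.0, 9.0, 15.0], [10.0, 11.0, 10.0, 17.0]]
```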




Hence, in a natural manner, we are starting to perform what appears to 
be conventional algebraic operations on an entire array of numbers, rather 
than on the individual entries in the arrays themselves. In mathematical 
terms an array of the form shown on the right-hand side of Eqn (9-1) is called 
a matrix of order (3 x 4). Here, analogous to the double suffix notation 
already introduced, the first number is taken to refer to the total number of 
rows in the matrix and the second number to refer to the total number of 
columns in the matrix. 

In terms of the simple physical situation used to introduce the notion of a
matrix and its associated algebra we have so far given no indication of the
interpretation to be placed upon multiplication. To elucidate the form taken
by this operation when applied to matrices, we again return to our physical
situation and consider the cost of buying $c_1$, $c_2$, $c_3$, and $c_4$ tons, respectively,
of products 1, 2, 3, and 4 from each of the three chemical plants in turn. If
the product costs are as shown in Table 9.1, and the costs of the orders are
denoted by $d_1$, $d_2$, and $d_3$, it is readily seen that

\[ \begin{aligned}
   d_1 &= a_{11}c_1 + a_{12}c_2 + a_{13}c_3 + a_{14}c_4 \\
   d_2 &= a_{21}c_1 + a_{22}c_2 + a_{23}c_3 + a_{24}c_4 \\
   d_3 &= a_{31}c_1 + a_{32}c_2 + a_{33}c_3 + a_{34}c_4.
   \end{aligned} \tag{9.7} \]


In terms of the matrix A in Eqn (9.1), the right-hand side of the first
equation in (9.7) is obtained by multiplying successive entries in the first row
of A by $c_1$, $c_2$, $c_3$, and $c_4$, respectively, and then adding the four products.
The same process will generate the right-hand side of both the second and
third equation in (9.7), provided that the entries in the second and third rows
of matrix A are used in place of those in the first row. If the four numbers
$c_1$, $c_2$, $c_3$, and $c_4$ are arranged in a column which is then regarded as a (4 x 1)
matrix, the basic operation of matrix multiplication is seen to be the
multiplication of a row of the first matrix into the column of the second to
yield a single number. Thus, in terms of the first row of A expressed as a
(1 x 4) matrix, we have the definition

\[ a_{11}c_1 + a_{12}c_2 + a_{13}c_3 + a_{14}c_4
   = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \end{bmatrix}
     \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix}, \]

where juxtaposition is used to imply multiplication of the row and column
matrices on the right-hand side.

Similarly, in terms of the second row of A expressed as a (1 x 4) matrix,
our definition yields

\[ a_{21}c_1 + a_{22}c_2 + a_{23}c_3 + a_{24}c_4
   = \begin{bmatrix} a_{21} & a_{22} & a_{23} & a_{24} \end{bmatrix}
     \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix}, \]

and a corresponding result is also true for the third row of A when expressed
as a (1 x 4) matrix. This special form of product is called either the inner
product or the scalar product of a row matrix and a column matrix.

Collectively these results suggest that we should write Eqns (9-7) in the 
matrix form 


C1 

ay 411 412 418 414 
C2 

dg} = [οἱ G22 23 aaa ὃ (9-8) 
C3 

ds a3, 835 433 (434 " 


with the understanding, as before, that multiplication is implied by juxta- 
position and means the inner product of rows of the first matrix with the 
column of the second matrix. To be consistent, equality of two matrices must 
then be taken to mean the equality of corresponding entries in two matrices 
of similar order. Using this convention our suffix notation works for us in 
the sense that the row number and the column number, taken in that order, 
which are involved in an inner product are the row and column numbers of 
the location into which that product is to be put. Thus in matrix equation 
(9-8), the number $d_2$ is in row 2, column 1 of the left-hand column matrix,
and it is the result of forming the inner product of row 2 of the first matrix 
on the right-hand side with column 1 of the second matrix. (The second 
matrix here only has one column.) 

If the column matrix with entries $d_1$, $d_2$, $d_3$ is denoted by D, and the column
matrix with entries $c_1$, $c_2$, $c_3$, and $c_4$ is denoted by C, then Eqn (9-8) can be
reduced to the deceptively simple equation 


D = AC. (9-9) 


It should be noticed that the resemblance to the algebra of real numbers ends 
here, because although multiplication is a commutative operation for real 
numbers, it is an easy task for the reader to verify that the matrix product 
CA is not even defined for the matrices involved here. Later we shall see that 
the non-commutative character of matrix multiplication is not the only 
difference between the field of real numbers and matrices. The result of matrix 
multiplication using numbers is illustrated in the following example:

$$
\begin{bmatrix} 1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 3 \\ 1 & -2 & 1 & 4 \end{bmatrix}
\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \end{bmatrix} =
\begin{bmatrix} 4 \\ 1 \\ 0 \end{bmatrix}.
$$


We remark in passing that the name scalar product of a row matrix and 
a column matrix derives from a comparison with the scalar product of two 
vectors. Namely, if $\boldsymbol{\alpha} = \alpha_1\mathbf{i} + \alpha_2\mathbf{j} + \alpha_3\mathbf{k}$,
$\boldsymbol{\beta} = \beta_1\mathbf{i} + \beta_2\mathbf{j} + \beta_3\mathbf{k}$ are two
vectors, then $\boldsymbol{\alpha} \cdot \boldsymbol{\beta} = \alpha_1\beta_1 + \alpha_2\beta_2 + \alpha_3\beta_3$, which is just the result of forming
the inner product of a row matrix with entries $\alpha_1$, $\alpha_2$, $\alpha_3$ and a column matrix
with entries $\beta_1$, $\beta_2$, $\beta_3$. Because of this similarity it is customary to refer to
matrices comprising only one row or one column as row vectors or column
vectors, respectively. Thus a general (1 x n) row vector may be considered as
a matrix representation of an ordinary form of vector having n components,
and which belongs to an n-dimensional space. 

This simple idea proves to be very fruitful in more advanced accounts of 
linear algebra where it leads to the study of what are called n-dimensional 
vector spaces. These spaces have properties very similar to those discussed in 
Chapter 4 and, as in three dimensions, the scalar product is related to the 
geometrical operation of projection in the space. In an n-dimensional vector
space a fundamental set of row or column vectors called a basis takes the
place of the unit vectors i, j, and k and leads to the important idea of linear
independence which will be examined later.

Because of the shape of the array, a general (m x n) matrix is called a
rectangular matrix. The rule just devised for the product of a (3 x 4) matrix
and a (4 x 1) column vector also applies to the product AB of two rectangular
matrices A and B, provided only that the number of columns in A is equal to
the number of rows of B. This last requirement follows directly from the
concept of an inner product which is only defined when the number of entries
in a row of A is equal to the number of entries in a column of B. Once again
the suffix notation works for us, because the inner product of row p of matrix
A and column q of matrix B is the number $c_{pq}$, which is found in row p and
column q of the product matrix C = AB. Consider the following example
which illustrates the application of this rule: 


ι2106Ὶ] 2 4 
1 2 

O11 3 - -- 71. 
0 2 

ι-2 1 4]. ...ὺ 0 11 


Then, for example, the entry in row 3, column 2 of the product matrix is the 
number 11, which is the inner product of row 3 of the first matrix involved in 
the product and column 2 of the second matrix involved in the product.

Notice that the rule for forming an inner product also determines the
shape of the product matrix C = AB, for C must have as many rows as A 
and as many columns as B. (Think about this and check it.) In fact these 
arguments may be formulated into a useful short-hand rule for checking that 
two matrices are conformable for multiplication, and at the same time 
displaying the shape of the product matrix. 


Rule 1 (Multiplication conformability rule) 


If A is an (m x n) matrix and B is a (p x q) matrix, then the matrix product
AB may be formed provided n = p. The resultant product matrix then has
the form (m x q). Symbolically we write this


(m x n)(p x q) = (m x q) only if n = p.


Thus matrix products of the form (3 x 7)(7 x 2) are conformable for
multiplication and yield a (3 x 2) matrix. Matrix products of the form
(7 x 3)(5 x 4) are not defined and certainly do not yield a (7 x 4) matrix.
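
Rule 1 can be expressed as a short shape check. The sketch below is only an
illustration; the function name `product_shape` and the convention of
returning `None` for a product that is not defined are assumptions made here.

```python
def product_shape(shape_a, shape_b):
    """Return the shape of AB, or None when AB is not defined.

    Shapes are (rows, columns) tuples, so (m, n)(p, q) -> (m, q) only if n == p.
    """
    m, n = shape_a
    p, q = shape_b
    return (m, q) if n == p else None


print(product_shape((3, 7), (7, 2)))  # (3, 2)
print(product_shape((7, 3), (5, 4)))  # None
```

The same check confirms that for a (3 x 4) matrix A and a (4 x 1) column C
the product AC is defined whereas CA is not.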

This rule has various important implications, and at this stage in our 
argument we would draw attention to the fact that even when for two matrices 
A and B, both the matrix products AB and BA are defined, they are not 
usually equal. Indeed, the order of the two product matrices may be different, 
as the following example shows. If 


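$$
A = \begin{bmatrix} 1 & 2 \\ 0 & -1 \\ 4 & 1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 2 & 1 \\ -1 & 1 & 0 \end{bmatrix},
$$

then

$$
AB = \begin{bmatrix} -1 & 4 & 1 \\ 1 & -1 & 0 \\ 3 & 9 & 4 \end{bmatrix}
\quad\text{and}\quad
BA = \begin{bmatrix} 5 & 1 \\ -1 & -3 \end{bmatrix}.
$$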

A different but most important way in which matrices can arise is in 
dealing with sets of simultaneous equations. Consider the following set of 
simultaneous equations: 


$$
\begin{aligned}
x + y + 2z &= 4 \\
2x - y + 3z &= 9 \\
3x - y - z &= 2.
\end{aligned}
$$

These equations may be written in matrix form by introducing a column
vector with entries x, y, z and then using the rule of matrix multiplication to
write

$$
\begin{bmatrix} 1 & 1 & 2 \\ 2 & -1 & 3 \\ 3 & -1 & -1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\begin{bmatrix} 4 \\ 9 \\ 2 \end{bmatrix}.
$$


With only a little practice, the reader will quickly learn to transcribe systems 
of equations into matrix form, for the patterns of numbers involved in the 
two numerical matrices are identical to the patterns of numbers in the 
equations themselves. 
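
The transcription can be tested by multiplying the coefficient array into a
trial column and comparing the result with the right-hand sides. The sketch
below assumes that the third equation reads 3x - y - z = 2; the helper
`mat_vec` and the trial values x = 1, y = -1, z = 2 are introduced here for
illustration and are not taken from the text.

```python
# Coefficient array and right-hand sides of the three equations.
A = [[1, 1, 2],
     [2, -1, 3],
     [3, -1, -1]]
K = [4, 9, 2]


def mat_vec(matrix, vector):
    """Inner product of each row of the matrix with the column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Trial values x = 1, y = -1, z = 2, checked by substitution.
X = [1, -1, 2]
left_hand_sides = mat_vec(A, X)
```

Each entry of `left_hand_sides` is the left-hand side of one equation
evaluated at the trial values.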

For obvious reasons the (3 x 3) matrix is called the coefficient matrix of 
the simultaneous equations. As in this case there are three equations and 
three unknowns, the coefficient matrix is square in shape. In general the name 
square matrix will be given to any (n x n) matrix. If the coefficient matrix
above is denoted by A, and the column vectors with entries x, y, z and 4, 9, 2 
are denoted, respectively, by X and K, we arrive at the matrix equation 


AX = K. 


There is a great temptation to attempt to solve this for X by dividing by 
A, but as it is meaningless to divide two arrays of numbers this approach 
must be abandoned. Later we will return to this matter and resolve the diffi- 
culty by introducing the concept of the inverse of a square matrix via the 
operation of multiplication. 

One final and important way in which matrices may arise is in connection
with what are called linear transformations. The idea involved here is perhaps
best understood if described in terms of coordinate transformations, and for
this purpose we now confine attention to a special change of coordinates in
a plane.

Suppose a set of rectangular cartesian axes O{x', y'} in a plane is derived
from a set of rectangular cartesian axes O{x, y} by rotation about O through
an angle $\theta$. Then under this process a point P in the (x, y)-plane with
coordinates $(\xi, \eta)$ appears as a point with coordinates $(\xi', \eta')$ in the
(x', y')-plane, as shown in Fig. 9-1.

Simple geometrical considerations show that

$$
\begin{aligned}
\xi' &= \xi\cos\theta - \eta\sin\theta \\
\eta' &= \xi\sin\theta + \eta\cos\theta.
\end{aligned}
$$

Now this result is true for any point P in the (x, y)-plane and its map in the
(x', y')-plane, so that with complete generality we may display the effect of
this coordinate transformation by writing

$$
\begin{aligned}
x' &= x\cos\theta - y\sin\theta \\
y' &= x\sin\theta + y\cos\theta.
\end{aligned}
\tag{9-10}
$$

Fig. 9-1 Rotation in a plane.

If the axes O{x', y'} and O{x, y} are thought of as belonging to two different
but superimposed planes with a common origin, then Eqns (9-10) may
be regarded as describing the relationship between points in each plane
when corresponding axes are inclined at an angle $\theta$. In this respect the
transformation described by Eqns (9-10) can be regarded as a function or
mapping, in the sense of Chapter 2, of the set of points comprising the
(x, y)-plane into the set of points comprising the (x', y')-plane. The mapping
is obviously one to one, and both the domain and range of the mapping are
the set of points comprising the plane itself. In matrix notation the relationship
becomes

$$
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}.
\tag{9-11}
$$


Hence by pursuing the simple idea of the geometrical operation of the
rotation of a plane about the origin we have arrived at the matrix

$$
R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.
\tag{9-12}
$$


The idea involved here is a much more general one than that involved in
simultaneous equations, since $R_\theta$ contains a complete description of how an
entire plane transforms or maps, together with whatever specific curves of
interest it may contain. In addition to this we have also produced an example
of a matrix whose entries, or elements as we shall call them henceforth, are
functions of a single real variable.
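
A rotation matrix of the form (9-12) can be built and applied to a point in a
few lines. This is a sketch only: the names `rotation_matrix` and `transform`
are assumptions, and angles are taken in radians.

```python
import math


def rotation_matrix(theta):
    """Return the (2 x 2) rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def transform(matrix, point):
    """Apply a (2 x 2) matrix to a point (x, y)."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


# A quarter turn carries the point (1, 0) into (0, 1), up to rounding error.
x_new, y_new = transform(rotation_matrix(math.pi / 2), (1.0, 0.0))
```

Because the elements are functions of the single variable theta, the same
function call describes the mapping of the whole plane for any chosen angle.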

Accordingly, it is reasonable to ask whether any meaning can be given to
the entity $dR_\theta/d\theta$, where $R_\theta$ is a matrix whose elements are functions of the
real variable $\theta$. This is not an abstract matter, for in mechanics and many
other subjects it is frequently convenient to work with axes that are fixed in
a rotating body. Indeed, the same sort of idea was implicit in the example
first used to introduce matrices. In that case by regarding the quality of the
raw material as a function of the time t, we arrive at a (3 x 4) matrix A(t)
whose elements $a_{mn}(t)$ are functions of time and any attempt to examine
rates of change involves considering the meaning of dA/dt.

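The term linear transformation in relation to the rotation transformation
(9-11) comes about as follows. Consider the effect of a rotation $\theta$ on the two
points $(\alpha, \beta)$ and $(\gamma, \delta)$ which map into the points $(\alpha', \beta')$ and $(\gamma', \delta')$,
respectively. Then from Eqns (9-10) we have

$$
\begin{aligned}
\alpha' &= \alpha\cos\theta - \beta\sin\theta \\
\beta' &= \alpha\sin\theta + \beta\cos\theta
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
\gamma' &= \gamma\cos\theta - \delta\sin\theta \\
\delta' &= \gamma\sin\theta + \delta\cos\theta,
\end{aligned}
$$

whence

$$
\begin{aligned}
\alpha' + \gamma' &= (\alpha + \gamma)\cos\theta - (\beta + \delta)\sin\theta \\
\beta' + \delta' &= (\alpha + \gamma)\sin\theta + (\beta + \delta)\cos\theta.
\end{aligned}
$$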


So, setting

$$
X_1 = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad
X_2 = \begin{bmatrix} \gamma \\ \delta \end{bmatrix},
$$

we have in fact shown that

$$
R_\theta(X_1 + X_2) = R_\theta X_1 + R_\theta X_2,
\tag{9-13}
$$

which asserts that multiplication by $R_\theta$ is distributive with respect to addition.
It is the general property described by Eqn (9-13) that is used to characterize a
linear transformation, and it is on account of this that $R_\theta X$ is called a linear
transformation of the vector X. In fact matrix multiplication is always
distributive with respect to addition, as we shall see later.
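
The distributive property of a rotation can be checked numerically for any
particular angle and pair of points. In the sketch below the function names,
the angle and the two points are arbitrary choices made for illustration; a
numerical check of this kind illustrates the property but does not prove it.

```python
import math


def rotate(theta, v):
    """Multiply the column vector v = (x, y) by the rotation matrix for theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def add(u, v):
    """Add two column vectors entry by entry."""
    return (u[0] + v[0], u[1] + v[1])


theta = 0.7                        # arbitrary angle in radians
X1, X2 = (2.0, -1.0), (0.5, 3.0)   # arbitrary points

lhs = rotate(theta, add(X1, X2))                 # rotation of the sum
rhs = add(rotate(theta, X1), rotate(theta, X2))  # sum of the rotations
difference = max(abs(l - r) for l, r in zip(lhs, rhs))
```

The two results agree to within floating-point rounding, so `difference` is
either zero or of the order of the machine precision.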

Thus far in our introductory presentation of matrices only intuitive argu- 
ments have been used. This approach has been adopted deliberately in an 
attempt to emphasize that matrices arise naturally, and that an obvious 
algebra suggests itself for their manipulation. To proceed further it now 
becomes necessary to formalize these ideas in exact mathematical terms, and 
then to develop them in systematic form to the point at which they can be 
used as a useful tool. 


9.2 Matrix algebra 


In this section we return to the fundamental ideas connected with matrices 
and their algebra which were outlined on an intuitive basis in Section 9-1. 
This time, however, our discussion will be more formal and, relying on our 
introductory account to provide motivation, we shall proceed quickly 
through the basic definitions and theorems, which will be illustrated by 
example. The problem of the solution of systems of linear equations and a 
discussion of linear transformations and some of their applications will be
presented in subsequent sections.


DEFINITION 9-1 (matrix and its order) A matrix is a rectangular array of
elements $a_{ij}$ involving m rows and n columns. The first suffix i in element $a_{ij}$ is
called the row index of the element and the second suffix j is called the column
index of the element. These indices specify the row number and column
number in which the element is located, with row 1 occurring at the top of
the array and column 1 at the extreme left. A matrix with m rows and n
columns is said to be of order m by n and this is written (m x n). The order
describes the shape of the matrix.


Special names are given to certain types of matrix and we now describe 
and give examples of some of the more frequently used terms. 

(a) A row matrix or row vector is any matrix of order (1 x n). The
following is an example of a row vector of order (1 x 4): 


[3 0 7 2]. 


(b) A column matrix or column vector is any matrix of order (n x 1). The 
following is an example of a column vector of order (3 x 1): 


$$
\begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix}.
$$


(c) A square matrix is any matrix of order (n x n). The following is an 
example of a square matrix of order (3 x 3): 


Ι.2 4 
a: 0. 2 
> ob. 3 


Three particular cases of square matrices that are worthy of note are the 
diagonal matrix, the symmetric matrix and the skew-symmetric matrix. Of 
these, the diagonal matrix has non-zero elements only on what is called the 
principal diagonal, which runs from the top left of the matrix to the bottom 
right. The principal diagonal is also often referred to as the leading diagonal. 
The following is an example of a diagonal matrix of order (4 x 4): 


$$
\begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix}.
$$


The diagonal matrix in which every element of the principal diagonal is
unity is called either the unit matrix or the identity matrix, and it is usually
denoted by I. The unit matrix of order (3 x 3) thus has the form

$$
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$


A symmetric matrix is one in which the elements obey the rule $a_{ij} = a_{ji}$,
so that the pattern of numbers has a reflection symmetry about the principal
diagonal. A typical symmetric matrix of order (3 x 3) is:

$$
\begin{bmatrix} 5 & 1 & 3 \\ 1 & 2 & -2 \\ 3 & -2 & 7 \end{bmatrix}.
$$


A skew-symmetric matrix is one in which the elements obey the rule
$a_{ij} = -a_{ji}$, so that the principal diagonal must contain zeros, whilst the
pattern of numbers has a reflection symmetry about the principal diagonal
but with a reversal of sign. A typical skew-symmetric matrix of order (3 x 3)
is:

$$
\begin{bmatrix} 0 & 1 & 5 \\ -1 & 0 & -3 \\ -5 & 3 & 0 \end{bmatrix}.
$$


(d) A null matrix is the name given to a matrix of any order which contains
only zero elements. It is usually denoted by the symbol 0. The null
matrix of order (2 x 3) has the form

$$
0 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
$$
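
The special types of square matrix described in (c) can be recognized by
simple tests on the elements. The following sketch stores a matrix as a list
of rows; the function names and the sample matrices are illustrative
assumptions.

```python
def is_square(m):
    """True when every row has as many entries as there are rows."""
    return all(len(row) == len(m) for row in m)


def is_diagonal(m):
    """True when all off-diagonal elements are zero."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def is_symmetric(m):
    """True when a_ij == a_ji for every pair of indices."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == m[j][i] for i in range(n) for j in range(n))


def is_skew_symmetric(m):
    """True when a_ij == -a_ji, which forces zeros on the principal diagonal."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == -m[j][i] for i in range(n) for j in range(n))


S = [[5, 1, 3], [1, 2, -2], [3, -2, 7]]     # symmetric
K = [[0, 1, 5], [-1, 0, -3], [-5, 3, 0]]    # skew-symmetric
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # unit matrix, also diagonal
```

Note that a square null matrix passes all three tests, since every element
is zero.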

DEFINITION 9-2 (equality of matrices) Two matrices A and B with general
elements $a_{ij}$ and $b_{ij}$, respectively, are equal only when they are both of the
same order and $a_{ij} = b_{ij}$ for all possible pairs of indices (i, j).


Example 9-1 Is it possible for the following pair of matrices to be equal
and, if so, for what value of a does equality occur:

$$
\begin{bmatrix} 5 & a^3 \\ a^2 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} 5 & -27 \\ 9 & 1 \end{bmatrix}.
$$

Solution The matrices are both of the same order and hence they will be
equal when their corresponding elements are equal. As corresponding elements
on the principal diagonal are indeed equal, we need only confine attention
to the off-diagonal elements. Thus the matrices will be equal if there is a
common solution to the two equations $a^2 = 9$ and $a^3 = -27$. Obviously,
equality will occur if a = -3.


DEFINITION 9-3 (addition of matrices) Two matrices A and B with general
elements $a_{ij}$ and $b_{ij}$, respectively, will be said to be conformable for addition
only if they are both of the same order. Their sum C = A + B is the matrix
C with elements $c_{ij} = a_{ij} + b_{ij}$.


As addition of real numbers is commutative we have $a_{ij} + b_{ij} = b_{ij} + a_{ij}$.
This shows that addition of conformable matrices must also be commutative,
whence


A + B = B + A. (9-14)


Now addition of real numbers is also associative so that
$(a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij})$. Hence if $a_{ij}$, $b_{ij}$, and $c_{ij}$ are general elements of
matrices A, B, and C which are conformable for addition, then this also
implies that addition of matrices is associative, whence


(A + B) + C = A + (B + C). (9-15)

Results (9-14) and (9-15) comprise our first theorem.

THEOREM 9-1 (matrix addition is both commutative and associative) If
A, B, and C are matrices which are conformable for addition, then


(a) A + B = B + A (Matrix Addition is Commutative);
(b) (A + B) + C = A + (B + C) (Matrix Addition is Associative).


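Example 9-2 Determine the constants a, b, c, and d in order that the
following matrix equation should be valid:

$$
\begin{bmatrix} 0 & a & 3 \\ b & 1 & 2 \end{bmatrix} +
\begin{bmatrix} c & 1 & 2 \\ 1 & 2 & d \end{bmatrix} =
\begin{bmatrix} 4 & 3 & 5 \\ 7 & 3 & 5 \end{bmatrix}.
$$

Solution Adding the two matrices on the left-hand side we arrive at the
matrix equation

$$
\begin{bmatrix} c & (a + 1) & 5 \\ (b + 1) & 3 & (d + 2) \end{bmatrix} =
\begin{bmatrix} 4 & 3 & 5 \\ 7 & 3 & 5 \end{bmatrix}.
$$

Equating corresponding elements shows that a = 2, b = 6, c = 4, and
d = 3.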


DEFINITION 9-4 (multiplication by scalar) If k is a scalar and the matrix
A has elements $a_{ij}$, then the matrix B = kA is the same order as A and has
elements $ka_{ij}$.


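Example 9-3 Determine 2A + 5B, given that:

$$
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} -1 & 3 \\ 4 & 2 \end{bmatrix}.
$$

Solution

$$
2A + 5B = 2\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + 5\begin{bmatrix} -1 & 3 \\ 4 & 2 \end{bmatrix}
$$

or,

$$
2A + 5B = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix} + \begin{bmatrix} -5 & 15 \\ 20 & 10 \end{bmatrix}
$$

whence

$$
2A + 5B = \begin{bmatrix} -3 & 19 \\ 26 & 18 \end{bmatrix}.
$$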


DEFINITION 9-5 (difference of two matrices) If the matrices A and B are
both of the same order, then their difference A - B is defined by the relation


A - B = A + (-1)B.


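Example 9-4 Determine A - B, given that:

$$
A = \begin{bmatrix} 1 & 3 \\ 4 & -2 \\ 1 & 6 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 4 & 2 \\ 3 & 1 \\ 0 & -2 \end{bmatrix}.
$$

Solution

$$
A - B = \begin{bmatrix} 1 & 3 \\ 4 & -2 \\ 1 & 6 \end{bmatrix} +
(-1)\begin{bmatrix} 4 & 2 \\ 3 & 1 \\ 0 & -2 \end{bmatrix},
$$

and so

$$
A - B = \begin{bmatrix} 1 & 3 \\ 4 & -2 \\ 1 & 6 \end{bmatrix} +
\begin{bmatrix} -4 & -2 \\ -3 & -1 \\ 0 & 2 \end{bmatrix} =
\begin{bmatrix} -3 & 1 \\ 1 & -3 \\ 1 & 8 \end{bmatrix}.
$$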
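
Definitions 9-3 to 9-5 translate directly into element-by-element
operations. The sketch below stores a matrix as a list of rows; the function
names `add`, `scale` and `subtract` and the variable names are assumptions
made for illustration.

```python
def add(a, b):
    """Sum of two matrices of the same order (Definition 9-3)."""
    same_order = len(a) == len(b) and all(
        len(r) == len(s) for r, s in zip(a, b))
    if not same_order:
        raise ValueError("matrices are not conformable for addition")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def scale(k, a):
    """Product of the scalar k and the matrix a (Definition 9-4)."""
    return [[k * x for x in row] for row in a]


def subtract(a, b):
    """Difference A - B, formed as A + (-1)B (Definition 9-5)."""
    return add(a, scale(-1, b))


# 2A + 5B for a pair of (2 x 2) matrices.
A = [[1, 2], [3, 4]]
B = [[-1, 3], [4, 2]]
combo = add(scale(2, A), scale(5, B))

# P - Q for a pair of (3 x 2) matrices.
P = [[1, 3], [4, -2], [1, 6]]
Q = [[4, 2], [3, 1], [0, -2]]
diff = subtract(P, Q)
```

Attempting `add(A, P)` raises an error, because a (2 x 2) and a (3 x 2)
matrix are not conformable for addition.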


DEFINITION 9-6 (matrix multiplication) The two matrices A and B with
general elements $a_{ij}$ and $b_{ij}$ are said to be conformable for matrix multiplication
provided that the number of columns in A equals the number of rows in
B. If A is of order (m x n) and B is of order (n x r), then the matrix product
AB is the matrix C of order (m x r) with elements $c_{ij}$, where

$$
c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.
$$

The number $c_{ij}$ is called the inner product of the ith row of A with the jth
column of B.


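Example 9-5 Determine A + BC, given that:

$$
A = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 1 & 1 \end{bmatrix}, \quad\text{and}\quad
C = \begin{bmatrix} 3 & 4 \\ 1 & 0 \\ 0 & 2 \end{bmatrix}.
$$

Solution Matrix B is of order (2 x 3) and matrix C is of order (3 x 2),
showing that B and C are conformable for multiplication. We have

$$
BC = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 4 \\ 1 & 0 \\ 0 & 2 \end{bmatrix} =
\begin{bmatrix} 7 & 8 \\ 7 & 10 \end{bmatrix},
$$

and so

$$
A + BC = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix} +
\begin{bmatrix} 7 & 8 \\ 7 & 10 \end{bmatrix} =
\begin{bmatrix} 8 & 12 \\ 9 & 13 \end{bmatrix}.
$$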
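
Definition 9-6 can be turned into a short routine in which each element of
the product is formed as an inner product. The following is a sketch under
the assumption that matrices are stored as lists of rows; the name `matmul`
is chosen here for illustration.

```python
def matmul(a, b):
    """Matrix product AB following Definition 9-6.

    a is (m x n) and b is (n x r), both stored as lists of rows.
    """
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("matrices are not conformable for multiplication")
    r = len(b[0])
    # c_ij is the inner product of row i of a with column j of b.
    return [[sum(row[k] * b[k][j] for k in range(n)) for j in range(r)]
            for row in a]


A = [[1, 4], [2, 3]]
B = [[1, 4, 2], [2, 1, 1]]
C = [[3, 4], [1, 0], [0, 2]]

BC = matmul(B, C)
A_plus_BC = [[x + y for x, y in zip(r, s)] for r, s in zip(A, BC)]
```

The product `matmul(A, B)` is also defined here, whereas `matmul(B, A)`
raises an error because B has three columns and A has only two rows.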


On account of the fact that matrix multiplication is not normally com- 
mutative, it is important to use a terminology that distinguishes between 
matrix multipliers that appear on the left or the right in a matrix product. 
This is achieved by adopting the convention that when matrix B is multiplied
by matrix A from the left to form the product AB, we shall say that B is
pre-multiplied by A. Conversely, when the matrix B is multiplied by A from the
right to form the product BA, we shall say that B is post-multiplied by A.

The most important results concerning matrix multiplication are con- 
tained in the following theorem, which asserts that matrix multiplication is 
distributive with respect to addition and that it is also associative. 


THEOREM 9-2 (matrix multiplication is distributive and associative) If
matrices A, B, and C are conformable for multiplication, then:

(a) matrix multiplication is distributive with respect to addition, so that

A(B + C) = AB + AC;

(b) matrix multiplication is associative, so that

A(BC) = (AB)C.
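
Before turning to the proof, the two results can be tried out on particular
matrices. The sketch below uses sample matrices chosen only for illustration
and helper names assumed here; agreement in one case illustrates the theorem
but is no substitute for the proof.

```python
def matmul(a, b):
    """Product of two conformable matrices stored as lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]


def add(a, b):
    """Sum of two matrices of the same order."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


# Sample matrices, chosen only for illustration.
A = [[1, 2]]            # order (1 x 2)
B = [[1, 3], [-1, 2]]   # order (2 x 2)
C = [[2, 1], [3, 1]]    # order (2 x 2)

distributive = matmul(A, add(B, C)) == add(matmul(A, B), matmul(A, C))
associative = matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
```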

Proof To establish result (a) let B and C be of order (m x n), and denote
their general elements by $b_{ij}$ and $c_{ij}$, respectively, so that the general element
of B + C is $b_{ij} + c_{ij}$. Then if A is of order (r x m) with general element $a_{ij}$,
and $d_{ij}$ is the general element of D = A(B + C) which is of order (r x n),
we have from Definition 9-6 that

$$
d_{ij} = a_{i1}(b_{1j} + c_{1j}) + a_{i2}(b_{2j} + c_{2j}) + \cdots + a_{im}(b_{mj} + c_{mj}).
$$

Performing the indicated multiplications and re-grouping we have

$$
d_{ij} = (a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj})
+ (a_{i1}c_{1j} + a_{i2}c_{2j} + \cdots + a_{im}c_{mj}).
$$

However, from Definition 9-6 this is seen to be equivalent to

D = AB + AC,


which was to be proved.
Result (b) may be established in similar fashion, and to achieve this we
assume A, B, and C to be respectively of order (p x q), (q x m), and (m x n)
with general elements $a_{ij}$, $b_{ij}$, and $c_{ij}$.

From Definition 9-6 we know that the general element occurring in row i,
column j of the product BC has the form

$$
b_{i1}c_{1j} + b_{i2}c_{2j} + \cdots + b_{im}c_{mj},
$$

so that the general element $d_{ij}$ occurring in row i column j of the product
D = A(BC) which is of order (p x n) must have the form

$$
\begin{aligned}
d_{ij} ={}& a_{i1}(b_{11}c_{1j} + b_{12}c_{2j} + \cdots + b_{1m}c_{mj}) \\
&+ a_{i2}(b_{21}c_{1j} + b_{22}c_{2j} + \cdots + b_{2m}c_{mj}) \\
&+ \cdots \\
&+ a_{iq}(b_{q1}c_{1j} + b_{q2}c_{2j} + \cdots + b_{qm}c_{mj}).
\end{aligned}
$$

Re-grouping of the terms then gives

$$
\begin{aligned}
d_{ij} ={}& (a_{i1}b_{11} + a_{i2}b_{21} + \cdots + a_{iq}b_{q1})c_{1j} \\
&+ (a_{i1}b_{12} + a_{i2}b_{22} + \cdots + a_{iq}b_{q2})c_{2j} \\
&+ \cdots \\
&+ (a_{i1}b_{1m} + a_{i2}b_{2m} + \cdots + a_{iq}b_{qm})c_{mj}.
\end{aligned}
$$

Appealing once more to Definition 9-6 we find that this is equivalent to

D = (AB)C,


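which was to be proved.


Example 9-6 If

$$
A = \begin{bmatrix} 1 & 2 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix},
$$

verify that
(a) A(B + C) = AB + AC,
(b) A(BC) = (AB)C.


Solution
(a) We have

$$
B + C = \begin{bmatrix} 3 & 4 \\ 2 & 3 \end{bmatrix}
$$

so that

$$
A(B + C) = \begin{bmatrix} 1 & 2 \end{bmatrix}
\begin{bmatrix} 3 & 4 \\ 2 & 3 \end{bmatrix} =
\begin{bmatrix} 7 & 10 \end{bmatrix};
$$

whereas

$$
AB = \begin{bmatrix} -1 & 7 \end{bmatrix}
\quad\text{and}\quad
AC = \begin{bmatrix} 8 & 3 \end{bmatrix},
$$

so that

$$
AB + AC = \begin{bmatrix} 7 & 10 \end{bmatrix}.
$$

(b) We have

$$
BC = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} =
\begin{bmatrix} 11 & 4 \\ 4 & 1 \end{bmatrix}
$$

so that

$$
A(BC) = \begin{bmatrix} 1 & 2 \end{bmatrix}
\begin{bmatrix} 11 & 4 \\ 4 & 1 \end{bmatrix} =
\begin{bmatrix} 19 & 6 \end{bmatrix};
$$

whereas

$$
AB = \begin{bmatrix} 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix} =
\begin{bmatrix} -1 & 7 \end{bmatrix},
$$

whence

$$
(AB)C = \begin{bmatrix} -1 & 7 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} =
\begin{bmatrix} 19 & 6 \end{bmatrix}.
$$

An important matrix operation involves the interchange of rows and
columns of a matrix, thereby changing a matrix of order (m x n) into one of
order (n x m). Thus a row vector is changed into a column vector and a
matrix of order (3 x 2) is changed into a matrix of order (2 x 3). This
operation is called the operation of transposition and is denoted by the
addition of a prime to the matrix in question.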


DEFINITION 9-7 (transposition operation) If A is a matrix of order (m x n),
then its transpose A' is the matrix of order (n x m) which is derived from A
by the interchange of rows and columns. Symbolically, if $a_{ij}$ is the element in
the ith row and jth column of A, then $a_{ji}$ is the element in the corresponding
position in A'.


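Example 9-7 Find A' and (A')', given that:

$$
A = \begin{bmatrix} 1 & 4 & 7 & 3 \\ 2 & -1 & 4 & -1 \end{bmatrix}.
$$

Solution Writing the first row in place of the first column and the second
row in place of the second column, as is required by Definition 9-7, we find
that

$$
A' = \begin{bmatrix} 1 & 2 \\ 4 & -1 \\ 7 & 4 \\ 3 & -1 \end{bmatrix}.
$$

The same argument shows that

$$
(A')' = \begin{bmatrix} 1 & 4 & 7 & 3 \\ 2 & -1 & 4 & -1 \end{bmatrix}.
$$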

It is obvious from the definition of the transpose operation that (A')' = A,
as was indeed illustrated in the last example. It is also obvious from
Definitions 9-3 and 9-5 that if A and B are conformable for addition, then

(A + B)' = A' + B'. (9-16)


Now if A is of order (m x n) and B is of order (n x r), and the general
matrix elements are $a_{ij}$ and $b_{ij}$, respectively, the element $c_{ij}$ in the ith row and
jth column of the matrix product C = AB is

$$
c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.
$$

By definition, this is the element that will appear in the jth row and ith
column of (AB)'.

Applying the transpose operation separately to A and B we find that A'
is of order (n x m) and B' is of order (r x n), so that only the matrix product
B'A' is conformable.

Now the elements of the jth row of B' are the elements of the jth column
of B, and the elements of the ith column of A' are the elements of the ith
row of A, so that the element $d_{ji}$ in the jth row and ith column of the product
D = B'A' must be

$$
d_{ji} = b_{1j}a_{i1} + b_{2j}a_{i2} + \cdots + b_{nj}a_{in}
$$

or, equivalently,

$$
d_{ji} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.
$$

However, equating elements in the jth row and ith column of (AB)' and
B'A' we find that $c_{ij} = d_{ji}$, and so


(AB)' = B'A'. (9-17)


We summarize these results into a final theorem. 


THEOREM 9-3 (properties of transposition operation) If A and B are
conformable for addition or multiplication, as required, then:


(a) (A')' = A (Transposition is Reflexive);
(b) (A + B)' = A' + B';

(c) (A - B)' = A' - B';

(d) (AB)' = B'A'.
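
The reversal of order in result (d) is easily overlooked, and a quick
numerical trial helps to fix it in mind. The sketch below assumes matrices
stored as lists of rows; the names `transpose` and `matmul` and the sample
matrices are illustrative choices.

```python
def transpose(a):
    """Interchange the rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*a)]


def matmul(a, b):
    """Product of two conformable matrices stored as lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for row in a]


A = [[1, 3], [2, 4]]
B = [[2, -1], [3, 1]]

lhs = transpose(matmul(A, B))             # (AB)'
rhs = matmul(transpose(B), transpose(A))  # B'A'
wrong_order = matmul(transpose(A), transpose(B))  # A'B', for comparison
```

For these matrices `lhs` and `rhs` agree, whereas `wrong_order` gives a
different matrix.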


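Example 9-8 Verify that (AB)' = B'A', given that:

$$
A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 2 & -1 \\ 3 & 1 \end{bmatrix}.
$$

Solution We have

$$
AB = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ 3 & 1 \end{bmatrix} =
\begin{bmatrix} 11 & 2 \\ 16 & 2 \end{bmatrix},
$$

whereas

$$
B'A' = \begin{bmatrix} 2 & 3 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} =
\begin{bmatrix} 11 & 16 \\ 2 & 2 \end{bmatrix},
$$

which is equal to (AB)'.


9.3 Determinants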


The notion of a determinant, when first introduced in Chapter 4, was that of 
a single number associated with a square array of numbers. In its subsequent 
application in that chapter it was used in a subsidiary role to simplify the 
manipulation of the vector product, and in that capacity it gave rise to a 
vector. The determinant made yet another appearance in Chapter 5 when, in 
connection with the change of variable in partial differentiation, it contained 
functions as elements, and was called a Jacobian. In this role it is often called
a functional determinant, and it gives rise to a function that is closely related 
to the one-to-one nature of the change of variables involved. 

These are but two of the situations in which determinants occur in 
different branches of mathematics, and it is the object of this section to 
examine some of the most important algebraic properties of determinants. 
Our results will only be proved for determinants of order 3 but they are, in 
fact, all true for determinants of any order. 

We begin by rewriting Definition 4-16 using the matrix element notation 
as follows: 


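DEFINITION 9-8 (third order determinant) Let A be the square matrix of
order (3 x 3)

$$
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.
$$

Then the expression

$$
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
$$

is called the third order determinant associated with the square matrix A,
and it is defined to be the number

$$
|A| = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix},
$$

where for any numbers a, b, c, and d,

$$
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.
$$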


The notation det A is also frequently used in place of | A | to signify the
determinant of A.

This definition has a number of consequences of considerable value in
simplifying the manipulation of determinants. Let us confine attention to the
third order determinant which is typical of all orders of determinant, and
expand the last line of Definition 9-8. We have

$$
\begin{aligned}
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
={}& a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} \\
&- a_{12}a_{21}a_{33} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31},
\end{aligned}
\tag{9-18}
$$

showing that one, and only one, element of each row and each column of the
determinant appears in each of the products on the right-hand side defining
| A |. Hence, if any row or column of a determinant is multiplied by a factor
$\lambda$, then the value of the determinant is multiplied by $\lambda$, since a factor $\lambda$ will
appear in each product on the right-hand side of Eqn (9-18). Conversely,
if any row or column of a determinant is divided by a factor $\lambda$, then the value
of the determinant is divided by $\lambda$. It is also obvious from Eqn (9-18) that
| A | = 0 if all the elements of a row or column of | A | are zero, or if all the
corresponding elements of two rows or columns of | A | are equal.

Suppose, for example, that $\lambda = 3$ and

$$
|A| = \begin{vmatrix} 1 & 2 & 3 \\ 2 & 1 & 1 \\ 4 & 1 & 2 \end{vmatrix}.
$$

Then it is easily shown that | A | = -5, so that 3| A | = -15. Now this
result could have been obtained equally well by using the above argument
and multiplying any row or any column of | A | by 3. If the first row of | A | is
multiplied by 3 we have

$$
3|A| = \begin{vmatrix} 3 & 6 & 9 \\ 2 & 1 & 1 \\ 4 & 1 & 2 \end{vmatrix} = -15
$$

or, alternatively, if the third column is multiplied by 3 we have

$$
3|A| = \begin{vmatrix} 1 & 2 & 9 \\ 2 & 1 & 3 \\ 4 & 1 & 6 \end{vmatrix} = -15.
$$

It is readily verified from Eqn (9-18) that interchanging any two rows or
columns of | A | changes its sign. Thus we have

$$
\begin{vmatrix} 1 & 4 & 3 \\ 2 & 1 & 4 \\ 9 & 4 & -6 \end{vmatrix} =
-\begin{vmatrix} 1 & 3 & 4 \\ 2 & 4 & 1 \\ 9 & -6 & 4 \end{vmatrix},
$$

in which the determinant on the left has been obtained from the one on the
right by interchanging the second and third columns.

A particularly simple case arises when | A | is the determinant associated
with a diagonal matrix A, for then all off-diagonal elements are automatically
zero. This implies that Eqn (9-18) reduces to $|A| = a_{11}a_{22}a_{33}$, which is just
the product of the elements of the principal diagonal. Thus if

$$
|A| = \begin{vmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 4 \end{vmatrix},
$$

then | A | = (3)(-2)(4) = -24.


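Another useful result is that the value of a determinant is unchanged when
elements of a row (or column) have added to them some multiple of the
corresponding elements of some other row (or column). We prove this result
by direct expansion in the following typical case. Consider the determinant
| D | obtained from | A | by adding to the elements of column 3 of | A |, $\lambda$
times the corresponding elements in column 2 of | A | to obtain:

$$
|D| = \begin{vmatrix}
a_{11} & a_{12} & a_{13} + \lambda a_{12} \\
a_{21} & a_{22} & a_{23} + \lambda a_{22} \\
a_{31} & a_{32} & a_{33} + \lambda a_{32}
\end{vmatrix}.
$$

Then at once Definition 9-8 asserts that

$$
\begin{aligned}
|D| ={}& a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
+ a_{11}\begin{vmatrix} a_{22} & \lambda a_{22} \\ a_{32} & \lambda a_{32} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} \\
&- a_{12}\begin{vmatrix} a_{21} & \lambda a_{22} \\ a_{31} & \lambda a_{32} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
+ \lambda a_{12}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}.
\end{aligned}
$$

Now the second term on the right-hand side is zero, whilst the fourth and
last terms cancel leaving only three remaining terms. These are seen to
comprise the definition of | A |, so that we have proved that | D | = | A | or, in
symbols, that

$$
\begin{vmatrix}
a_{11} & a_{12} & a_{13} + \lambda a_{12} \\
a_{21} & a_{22} & a_{23} + \lambda a_{22} \\
a_{31} & a_{32} & a_{33} + \lambda a_{32}
\end{vmatrix} =
\begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}.
$$

A similar result would have been obtained had different columns been used
or, indeed, had rows been used instead of columns.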

An obvious implication of this result is that if a row (or column) of a 
determinant is expressible as the sum of multiples of other rows (or columns) 
of the determinant, then the value of the determinant must be zero. This is so 
because by subtraction of this sum of multiples of other rows (or columns) 
from the row (or column) in question, it is possible to produce a row (or 
column) containing only zero elements. 

Let us illustrate how a determinant may be simplified by means of this
result. Consider the determinant

$$
|A| = \begin{vmatrix} 7 & 18 & 8 \\ 1 & 5 & 7 \\ 3 & 9 & 4 \end{vmatrix}.
$$

Subtracting twice the third row from the first row we find

$$
|A| = \begin{vmatrix} 1 & 0 & 0 \\ 1 & 5 & 7 \\ 3 & 9 & 4 \end{vmatrix},
$$

whence | A | = -43.


Let us summarize our findings in the form of a theorem. 


THEOREM 9-4 (properties of determinants) 


(a) A determinant in which all the elements of a row or column are zero, 
itself has the value zero; 

(b) A determinant in which all corresponding elements in two rows (or 
columns) are equal has the value zero; 

(c) If the elements of a row (or column) of a determinant are multiplied
by a factor $\lambda$, then the value of the determinant is multiplied by $\lambda$;

(d) The value of a determinant associated with a diagonal matrix is 
equal to the product of the elements on the principal diagonal; 

(e) The value of a determinant is unaltered by adding to the elements of 
any row (or column), a constant multiple of the corresponding elements 
of any other row (or column); 

(f) If a row (or column) of a determinant is expressible as the sum of 
multiples of other rows (or columns) of the determinant, then its value is 
zero. 
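
Several parts of Theorem 9-4 can be tried out numerically with the six-term
expansion of Eqn (9-18). The sketch below is illustrative only; the function
name `det3`, the variable names and the multiples 2 and 3 used to build a
dependent row are assumptions made here.

```python
def det3(m):
    """Third order determinant via the six-term expansion of Eqn (9-18)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * a22 * a33 - a11 * a23 * a32 + a12 * a23 * a31
            - a12 * a21 * a33 + a13 * a21 * a32 - a13 * a22 * a31)


A = [[7, 18, 8], [1, 5, 7], [3, 9, 4]]

# (e) Subtract twice row 3 from row 1; the value should not change.
reduced = [[a - 2 * c for a, c in zip(A[0], A[2])], A[1], A[2]]

# (b) Two equal rows, and (f) a row that is a sum of multiples of the others.
equal_rows = [A[1], A[1], A[2]]
dependent = [[2 * b + 3 * c for b, c in zip(A[1], A[2])], A[1], A[2]]

# (d) Diagonal matrix: the determinant is the product of the diagonal.
diagonal = [[3, 0, 0], [0, -2, 0], [0, 0, 4]]
```

Evaluating `det3` on each array reproduces the behaviour asserted in the
corresponding part of the theorem.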


Higher order determinants can be defined with exactly similar properties 
to those enumerated in the theorem above. Thus the determinant | A | of 
order n associated with the square matrix A of order (n x n) has n! terms in 
its expansion, each of which contains one, and only one, element from each
row and column of A. 


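DEFINITION 9-9 (fourth order determinant) If A is the square matrix of
order (4 x 4)

\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix},
\]

then the expression

\[
| A | = \begin{vmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{vmatrix}
\]

is called the fourth order determinant associated with the square matrix A,
and it is defined to be the number

\[
| A | = a_{11}\begin{vmatrix} a_{22} & a_{23} & a_{24} \\ a_{32} & a_{33} & a_{34} \\ a_{42} & a_{43} & a_{44} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} & a_{24} \\ a_{31} & a_{33} & a_{34} \\ a_{41} & a_{43} & a_{44} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} & a_{24} \\ a_{31} & a_{32} & a_{34} \\ a_{41} & a_{42} & a_{44} \end{vmatrix}
- a_{14}\begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix}.
\]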

An inductive argument applied to Definitions 9-8 and 9-9 shows one way 
in which higher order determinants may be defined, but clearly our notation 
needs some simplification to avoid unwieldy expressions of the type given 
above. This is achieved by the introduction of the minor and the cofactor of 
an element of a square matrix. 


DEFINITION 9-10 (minors and cofactors) Let A be a square matrix of
order (n x n) with general element $a_{ij}$, and let | A | be the determinant of
order n associated with A. Denote by $M_{ij}$ the determinant of order (n - 1)
associated with the matrix of order (n - 1) x (n - 1) derived from A by the
deletion of row i and column j. Then $M_{ij}$ is called the minor of the element
$a_{ij}$ of A, and $A_{ij} = (-1)^{i+j}M_{ij}$ is called the cofactor of the
element $a_{ij}$ of A.


Example 9-9 Find the minors and cofactors of the matrix




\[
A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 1 & 4 \\ 1 & 2 & 1 \end{bmatrix}.
\]

Solution The minor $M_{11}$ is derived from A by deleting row 1 and column 1
and equating $M_{11}$ to the determinant formed by the remaining elements.
That is,

\[
M_{11} = \begin{vmatrix} 1 & 4 \\ 2 & 1 \end{vmatrix} = -7.
\]

Similarly, minor $M_{12}$ is derived from A by deleting row 1 and column 2 and
equating $M_{12}$ to the determinant formed by the remaining elements. That is,

\[
M_{12} = \begin{vmatrix} 2 & 4 \\ 1 & 1 \end{vmatrix} = -2.
\]

Identical reasoning then shows that $M_{13} = 3$, $M_{21} = -6$, $M_{22} = -2$,
$M_{23} = 2$, $M_{31} = -3$, $M_{32} = -2$, and $M_{33} = 1$. As the cofactors
$A_{ij} = (-1)^{i+j}M_{ij}$, it follows that $A_{11} = -7$, $A_{12} = 2$,
$A_{13} = 3$, $A_{21} = 6$, $A_{22} = -2$, $A_{23} = -2$, $A_{31} = -3$,
$A_{32} = 2$, and $A_{33} = 1$.
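
The minors and cofactors of this example can be generated mechanically from
Definition 9-10. The sketch below is illustrative only; the function names
`minor_matrix`, `det`, `minors`, and `cofactors` are assumptions introduced
here, and rows and columns are indexed from 0 in the code rather than from 1.

```python
def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def minors(m):
    n = len(m)
    return [[det(minor_matrix(m, i, j)) for j in range(n)] for i in range(n)]

def cofactors(m):
    n = len(m)
    M = minors(m)
    # The sign factor (-1)**(i + j) is the same for 0-based and 1-based indices.
    return [[(-1) ** (i + j) * M[i][j] for j in range(n)] for i in range(n)]

A = [[1, 0, 3], [2, 1, 4], [1, 2, 1]]
print(minors(A))     # [[-7, -2, 3], [-6, -2, 2], [-3, -2, 1]]
print(cofactors(A))  # [[-7, 2, 3], [6, -2, -2], [-3, 2, 1]]
```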

If A is a square matrix with general element $a_{ij}$ and corresponding
cofactor $A_{ij}$, it is easily seen that:

(a) if A is of order (2 x 2), then $| A | = a_{11}A_{11} + a_{12}A_{12}$,
(b) if A is of order (3 x 3), then $| A | = a_{11}A_{11} + a_{12}A_{12} + a_{13}A_{13}$,
(c) if A is of order (4 x 4), then $| A | = a_{11}A_{11} + a_{12}A_{12} + a_{13}A_{13} + a_{14}A_{14}$.

This suggests that if A is of order (n x n), then for | A | we could adopt the
definition

\[
| A | = a_{11}A_{11} + a_{12}A_{12} + \cdots + a_{1n}A_{1n}. \tag{9-19}
\]


This is a true statement and could be accepted as a definition, but it is
not the most general one which may be adopted. To see this we return to
Eqn (9-18) and re-arrange the terms on the right-hand side to give

\[
| A | = a_{31}(a_{12}a_{23} - a_{13}a_{22}) - a_{32}(a_{11}a_{23} - a_{13}a_{21})
+ a_{33}(a_{11}a_{22} - a_{12}a_{21}).
\]

Hence, working backwards, we have

\[
| A | = a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}
- a_{32}\begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}
+ a_{33}\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix},
\]

thereby showing that it is also true that

\[
| A | = a_{31}A_{31} + a_{32}A_{32} + a_{33}A_{33}. \tag{9-20}
\]


We now have two equivalent but different looking expressions for | A | 
either of which could be taken as the definition of | A |. The expression in 
(b) above involves the elements and cofactors of the first row of A and the 
expression in Eqn (9-20) involves the elements and cofactors of the third row 
of A. A repetition of this argument involving other rearrangements of the 
terms of Eqn (9-18) shows that | A| may be evaluated as the sum of the 
products of the elements and their cofactors of any row or column of A. This 
very valuable and general result is known as the Laplace expansion theorem, 
and it is true for determinants of any order though we have only proved it 
for a third order determinant. Let us state this result formally as it would 
apply to a determinant of order n. 


THEOREM 9-5 (Laplace expansion theorem) The determinant | A | associated
with any (n x n) square matrix A is obtained by summing the products of
the elements and their cofactors in any row or column of A. If A has the
general element $a_{ij}$ and the corresponding cofactor is $A_{ij}$, then this
result is equivalent to:

Expansion by elements of a row

\[
| A | = \sum_{j=1}^{n} a_{ij}A_{ij} \quad \text{for } i = 1, 2, \ldots, n;
\]

Expansion by elements of a column

\[
| A | = \sum_{i=1}^{n} a_{ij}A_{ij} \quad \text{for } j = 1, 2, \ldots, n.
\]


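Example 9-10 Evaluate the determinant

\[
| A | = \begin{vmatrix} 1 & 4 & 2 \\ 3 & -2 & 1 \\ 1 & 5 & 2 \end{vmatrix}
\]

by expanding it (a) in terms of the elements of row 2, and (b) in terms of the
elements of column 3.

Solution

\[
\text{(a)} \quad | A | = -3\begin{vmatrix} 4 & 2 \\ 5 & 2 \end{vmatrix}
- 2\begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix}
- 1\begin{vmatrix} 1 & 4 \\ 1 & 5 \end{vmatrix} = 5,
\]

\[
\text{(b)} \quad | A | = 2\begin{vmatrix} 3 & -2 \\ 1 & 5 \end{vmatrix}
- 1\begin{vmatrix} 1 & 4 \\ 1 & 5 \end{vmatrix}
+ 2\begin{vmatrix} 1 & 4 \\ 3 & -2 \end{vmatrix} = 5.
\]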
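
Theorem 9-5 states that every row and every column gives the same value. As a
rough check, the sketch below expands the determinant of Example 9-10 along
each row and column in turn; `expand_row` and `expand_col` are hypothetical
helper names used only for this illustration, with 0-based indices.

```python
def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def expand_row(m, i):
    """Sum of elements of row i times their cofactors."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor_matrix(m, i, j))
               for j in range(len(m)))

def expand_col(m, j):
    """Sum of elements of column j times their cofactors."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor_matrix(m, i, j))
               for i in range(len(m)))

A = [[1, 4, 2], [3, -2, 1], [1, 5, 2]]
print([expand_row(A, i) for i in range(3)])  # [5, 5, 5]
print([expand_col(A, j) for j in range(3)])  # [5, 5, 5]
```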


An important extension of Theorem 9-5 asserts that the sum of the 
products of the elements of any row (or column) of a square matrix A with 
the cofactors corresponding to the elements of a different row (or column) is 
zero. This is easily proved as follows. 

Let A be a matrix of order (n x n), and let B be obtained from A by
replacing row q of A by row p. Then B has the elements of rows p and q equal,
so that by Theorem 9-4(b) it follows that | B | = 0. The cofactors of the
elements of row q are the same in B as in A, because they do not involve the
elements of row q itself. Expanding | B | in terms of elements of row q by
Theorem 9-5 we then find

\[
| B | = a_{p1}A_{q1} + a_{p2}A_{q2} + \cdots + a_{pn}A_{qn} = 0,
\]

which was to be proved. A similar argument establishes the corresponding
result for columns and so we have proved our assertion.


THEOREM 9-6 The sum of the products of the elements of any row (or
column) of a square matrix A with the cofactors corresponding to the
elements of a different row (or column) is zero. Symbolically, if $a_{ij}$ is
the general element of A and $A_{ij}$ is its cofactor, then:

Expansion by elements of a row

\[
\sum_{i=1}^{n} a_{pi}A_{qi} = 0 \quad \text{if } p \neq q; \text{ and}
\]

Expansion by elements of a column

\[
\sum_{i=1}^{n} a_{ip}A_{iq} = 0 \quad \text{if } p \neq q.
\]

Example 9-11 Verify that the sum of the products of the elements of
column 1 and the corresponding cofactors of column 2 of the following
matrix is zero:

\[
A = \begin{bmatrix} 1 & 3 & 2 \\ 4 & 1 & 2 \\ 3 & 1 & 3 \end{bmatrix}.
\]

Solution The elements of column 1 are $a_{11} = 1$, $a_{21} = 4$, $a_{31} = 3$.
The cofactors corresponding to the elements of the second column are
$A_{12} = -6$, $A_{22} = -3$, $A_{32} = 6$. Hence

\[
a_{11}A_{12} + a_{21}A_{22} + a_{31}A_{32} = (1)(-6) + (4)(-3) + (3)(6) = 0.
\]
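
Theorems 9-5 and 9-6 together say that pairing the elements of column p with
the cofactors of column q gives | A | when p = q and zero otherwise. The
following sketch tabulates all nine such sums for the matrix of Example 9-11;
the helper names are assumptions made for this illustration.

```python
def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def cofactor(m, i, j):
    return (-1) ** (i + j) * det(minor_matrix(m, i, j))

A = [[1, 3, 2], [4, 1, 2], [3, 1, 3]]
n = len(A)

# Entry (p, q): elements of column p multiplied by cofactors of column q.
table = [[sum(A[i][p] * cofactor(A, i, q) for i in range(n))
          for q in range(n)] for p in range(n)]
for row in table:
    print(row)
# [-15, 0, 0]
# [0, -15, 0]
# [0, 0, -15]
```

The entry in the first row and second column of the table is the sum worked
out in Example 9-11.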


9.4 Linear dependence and linear independence 


We are now in a position to discuss the important idea of linear independence.
This concept has already been used implicitly in Chapter 4 when the three
mutually orthogonal unit vectors i, j, and k were introduced comprising what
in linear algebra is called a basis for the vector space. By this we mean that
all other vectors are expressible in terms of the vectors comprising the basis
through the operations of scaling and vector addition, but that no member
of the basis itself is expressible in terms of the other members of the basis.
Thus no choice of the scalars λ, μ can ever make the vectors i and λj + μk
equal. It is in this sense that the unit vectors i, j, k comprising the basis for
ordinary vector analysis are linearly independent, and obviously any other
set of unit vectors a, b, c which are not co-planar, and no two of which are
parallel, would serve equally well as a basis for this space.

The same idea carries across to matrices when the term vector is
interpreted to mean either a matrix row vector or a matrix column vector. Thus
the three column vectors

\[
C_1 = \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}, \quad
C_2 = \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}, \quad \text{and} \quad
C_3 = \begin{bmatrix} 5 \\ 5 \\ 6 \end{bmatrix}
\]

are not linearly independent because $C_3 = C_1 + 2C_2$, whereas the three row
vectors

\[
R_1 = [1 \;\; 0 \;\; 0], \quad R_2 = [0 \;\; 1 \;\; 0], \quad \text{and} \quad R_3 = [0 \;\; 0 \;\; 1]
\]

are obviously linearly independent, because no choice of the scalars λ, μ can
ever make the vectors $R_1$ and $\lambda R_2 + \mu R_3$ equal. It is these ideas
that underlie the formulation of the following definition.


DEFINITION 9-11 (linear dependence and linear independence) The set of
n matrix row or column vectors $V_1, V_2, \ldots, V_n$ which are conformable for
addition will be said to be linearly dependent if there exist n scalars
$\alpha_1, \alpha_2, \ldots, \alpha_n$, not all zero, such that

\[
\alpha_1 V_1 + \alpha_2 V_2 + \cdots + \alpha_n V_n = 0.
\]

When no such set of scalars exists, so that this relationship is only true
when $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$, then the n matrix vectors
$V_1, V_2, \ldots, V_n$ will be said to be linearly independent.


In the event that the n matrix vectors in Definition 9-11 represent the
rows or columns of a rectangular matrix A, the linear dependence or
independence of the vectors $V_1, V_2, \ldots, V_n$ becomes a statement about the
linear dependence or independence of the rows or columns of A. In particular,
if A is a square matrix, and linear dependence exists between its rows (or
columns), then by definition it is possible to express at least one row (or
column) of A as the sum of multiples of the other rows (or columns). Thus
from Theorem 9-4 (f), we see that linear dependence amongst the rows
or columns of a square matrix A implies the condition | A | = 0. Similarly,
if | A | ≠ 0 then the rows and columns of A cannot be linearly dependent.


THEOREM 9-7 (test for linear independence) The rows and columns of a
square matrix A are linearly independent if, and only if, | A | ≠ 0. Conversely,
linear dependence is implied between rows or columns of a square matrix
A if | A | = 0.


Example 9-12 Test the following matrices for linear independence between
rows or columns:

\[
A = \begin{bmatrix} 1 & 4 & 3 \\ -2 & 18 & 7 \\ 4 & -6 & 1 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 1 & 1 & 0 \\ 3 & 2 & 1 \\ 1 & 1 & 3 \end{bmatrix}.
\]

Solution We shall apply Theorem 9-7 by examining | A | and | B |. A simple
calculation shows that | A | = 0, so that linear dependence exists between
either the rows or the columns of A. In fact, denoting the columns of A by
$C_1$, $C_2$, and $C_3$, we have $C_2 = 2(C_3 - C_1)$. As | B | = -3 the rows and
columns of B are linearly independent.
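
The test of Theorem 9-7 is easy to mechanize for small matrices. The sketch
below applies it to the matrices of Example 9-12 and also confirms the stated
relation between the columns of A; the function name `independent` is a
hypothetical one chosen for this illustration.

```python
def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def independent(m):
    """Theorem 9-7: rows and columns are independent exactly when |m| != 0."""
    return det(m) != 0

A = [[1, 4, 3], [-2, 18, 7], [4, -6, 1]]
B = [[1, 1, 0], [3, 2, 1], [1, 1, 3]]

print(det(A), independent(A))  # 0 False
print(det(B), independent(B))  # -3 True

# Confirm the column relation C2 = 2(C3 - C1) quoted for A.
C1, C2, C3 = (list(col) for col in zip(*A))
print(C2 == [2 * (c3 - c1) for c1, c3 in zip(C1, C3)])  # True
```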


Let us now give consideration to any linear independence that may exist
between the rows or columns of a rectangular matrix A of order (m x n). If
r is the greatest number of rows (or columns) of A that are linearly
independent, where r ≤ min (m, n), then Theorem 9-7 implies that there is at
least one determinant of order r that may be formed by taking these r rows
(or columns) which is non-zero, but that all determinants of order greater
than r must of necessity vanish. This number r is called the rank of the
matrix A, and it represents the greatest number of linearly independent rows
or columns existing in A. If, for example, A is a square matrix of order
(n x n) and | A | ≠ 0, this implies that the rank of A must be n.

DEFINITION 9-12 (rank of a matrix) The rank r of a matrix A is the greatest
number of linearly independent rows or columns that exist in the matrix A.
Numerically, r is equal to the order of the largest order non-vanishing
determinant | B | associated with any square matrix B which can be constructed
from A by combination of r rows and r columns.


Example 9-13 Find the rank of the following matrix:

\[
A = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 \\
-1 & 1 & 1 & -1 & 1 \\
-3 & 0 & 1 & -1 & 0
\end{bmatrix}.
\]

Solution The largest order of determinant that can be constructed in this
case from the rows and columns of A is 3. As there is certainly one such
determinant that is non-vanishing, namely the one associated with the first
three columns of A, the rank of A must be 3. The fact that other non-vanishing
determinants of order three may be constructed from A is immaterial (e.g.,
take the last three columns).
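
For larger matrices, searching for a non-vanishing determinant quickly becomes
tedious. The sketch below therefore uses a different technique from the one
in Definition 9-12: it reduces the matrix by elementary row operations, which
leave the rank unchanged, and counts the non-zero rows that remain. The
function name `rank` and the use of exact `Fraction` arithmetic are choices
made for this illustration.

```python
from fractions import Fraction

def rank(m):
    """Rank found by row reduction, counting the pivots that are located."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0  # index of the next pivot row
    for c in range(len(rows[0])):
        # Look for a non-zero element in column c at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

A = [[1, 0, 0, 1, 0],
     [-1, 1, 1, -1, 1],
     [-3, 0, 1, -1, 0]]
print(rank(A))  # 3
```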


9.5 Inverse and adjoint matrix 


The operation of division is not defined for matrices, but a multiplicative
inverse matrix denoted by $A^{-1}$ can be defined for any square matrix A for
which | A | ≠ 0. This multiplicative inverse $A^{-1}$ is unique and has the
property that

\[
A^{-1}A = AA^{-1} = I,
\]

where I is the unit matrix, and it is defined in terms of what is called the
matrix adjoint to A. The uniqueness follows from the fact that if B and C are
each inverse to A, then B(AC) = (BA)C, so that BI = IC, or B = C.


DEFINITION 9-13 (adjoint matrix) Let A be a square matrix, then the 
transpose of the matrix of cofactors of A is called the matrix adjoint to A, 
and it is denoted by adj A. A square matrix and its adjoint are both of the 
same order. 


Example 9-14 Find the matrix adjoint to:

\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 1 & 0 \\ 2 & 1 & 2 \end{bmatrix}.
\]

Solution The cofactors $A_{ij}$ of A are: $A_{11} = 2$, $A_{12} = -6$,
$A_{13} = 1$, $A_{21} = -3$, $A_{22} = 0$, $A_{23} = 3$, $A_{31} = -1$,
$A_{32} = 3$, and $A_{33} = -5$. Hence the matrix of cofactors has the form

\[
\begin{bmatrix} 2 & -6 & 1 \\ -3 & 0 & 3 \\ -1 & 3 & -5 \end{bmatrix},
\]

so that its transpose, which by definition is adj A, is

\[
\operatorname{adj} A = \begin{bmatrix} 2 & -3 & -1 \\ -6 & 0 & 3 \\ 1 & 3 & -5 \end{bmatrix}.
\]


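Now from Theorems 9-5 and 9-6, we see that the effect of forming either
the product (adj A)A or the product A(adj A) is to produce a diagonal
matrix in which each element of the leading diagonal is | A |. That is, we
have shown that

\[
(\operatorname{adj} A)A = A(\operatorname{adj} A) =
\begin{bmatrix}
| A | & 0 & 0 & \cdots & 0 \\
0 & | A | & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & | A |
\end{bmatrix},
\]

whence

\[
(\operatorname{adj} A)A = A(\operatorname{adj} A) = | A | I. \tag{9-21}
\]

Thus, provided | A | ≠ 0, by writing

\[
A^{-1} = \frac{\operatorname{adj} A}{| A |}, \tag{9-22}
\]

we arrive at the result

\[
A^{-1}A = AA^{-1} = I. \tag{9-23}
\]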


The matrix $A^{-1}$ is called the matrix inverse to A and it is only defined for
square matrices A for which | A | ≠ 0. A square matrix whose associated
determinant is non-vanishing is called a non-singular matrix. Although the
inverse matrix is only defined for non-singular square matrices, the adjoint
matrix is defined for any square matrix, irrespective of whether or not it is
non-singular.


DEFINITION 9-14 (inverse matrix) If A is a square matrix for which
| A | ≠ 0, the matrix inverse to A which is denoted by $A^{-1}$ is defined by the
relationship

\[
A^{-1} = \frac{\operatorname{adj} A}{| A |}.
\]

Example 9-15 Find the matrix inverse to the matrix A of Example 9-14
above.


Solution It is easily found from the cofactors already computed that
| A | = -9. This follows, for example, by expanding | A | in terms of
elements of the first row to obtain | A | = (1)(2) + (2)(-6) + (1)(1) = -9.
Hence from Definition 9-14, we have

\[
A^{-1} = \frac{\operatorname{adj} A}{| A |}
= \left(-\frac{1}{9}\right)\begin{bmatrix} 2 & -3 & -1 \\ -6 & 0 & 3 \\ 1 & 3 & -5 \end{bmatrix}
= \begin{bmatrix} -2/9 & 1/3 & 1/9 \\ 2/3 & 0 & -1/3 \\ -1/9 & -1/3 & 5/9 \end{bmatrix}.
\]


The steps in the determination of an inverse matrix are perhaps best 
remembered in the form of a rule. 


Rule 2 (Determination of inverse matrix) 


To determine the matrix $A^{-1}$ which is inverse to the square matrix A proceed
as follows:


(a) Construct the matrix of cofactors of A;

(b) Transpose the matrix of cofactors of A to obtain adj A;

(c) Calculate | A | and, if it is not zero, divide adj A by | A | to obtain
$A^{-1}$;

(d) If | A | = 0, then $A^{-1}$ is not defined.
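
The steps of Rule 2 can be sketched in a few lines of Python. This is an
illustration under stated assumptions rather than a recommended numerical
method: the names `adjoint` and `inverse` are hypothetical, exact `Fraction`
arithmetic is used to avoid rounding, and the matrix is that of Example 9-14.

```python
from fractions import Fraction

def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def adjoint(m):
    """Steps (a) and (b): transpose of the matrix of cofactors."""
    n = len(m)
    cof = [[(-1) ** (i + j) * det(minor_matrix(m, i, j)) for j in range(n)]
           for i in range(n)]
    return [list(col) for col in zip(*cof)]

def inverse(m):
    """Steps (c) and (d): divide adj A by |A|, or fail when |A| = 0."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular, so the inverse is not defined")
    return [[Fraction(x, d) for x in row] for row in adjoint(m)]

A = [[1, 2, 1], [3, 1, 0], [2, 1, 2]]
print(adjoint(A))  # [[2, -3, -1], [-6, 0, 3], [1, 3, -5]]
for row in inverse(A):
    print(*row)
# -2/9 1/3 1/9
# 2/3 0 -1/3
# -1/9 -1/3 5/9
```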


It is a trivial consequence of Definition 9-14 and the fact that for any
square matrix A, | A | = | A' | (see Problem 9-34), that

\[
(A^{-1})' = (A')^{-1}. \tag{9-24}
\]

Also, if A and B are non-singular matrices of the same order, then

\[
(B^{-1}A^{-1})AB = I = AB(B^{-1}A^{-1}),
\]

showing that

\[
(AB)^{-1} = B^{-1}A^{-1}. \tag{9-25}
\]

Accepting the result of Problem 9-35 as being valid for square matrices
A, B of arbitrary order (n x n), so that | AB | = | A || B |, we are able to
prove another useful result concerning the inverse matrix. If | A | ≠ 0, then
$AA^{-1} = I$ showing that $| AA^{-1} | = 1$, or $| A || A^{-1} | = 1$. It
follows from this that:

\[
| A^{-1} | = 1/| A |. \tag{9-26}
\]

One final result follows directly from the obvious fact that
$(A^{-1})^{-1}A^{-1} = I$, which is always true provided $| A^{-1} | \neq 0$. If
we post-multiply this result by A we find

\[
(A^{-1})^{-1}A^{-1}A = IA
\]

giving

\[
(A^{-1})^{-1}I = A,
\]

whence

\[
(A^{-1})^{-1} = A. \tag{9-27}
\]


THEOREM 9-8 (properties of inverse matrix) If A and B are non-singular
square matrices of the same order, then:


(a) $AA^{-1} = A^{-1}A = I$;
(b) $(AB)^{-1} = B^{-1}A^{-1}$;
(c) $(A^{-1})' = (A')^{-1}$;
(d) $(A^{-1})^{-1} = A$;
(e) $| A^{-1} | = 1/| A |$.


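Example 9-16 Verify that $(A^{-1})' = (A')^{-1}$, given that

\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}.
\]

Solution We have

\[
A^{-1} = \begin{bmatrix} -2 & 3/2 \\ 1 & -1/2 \end{bmatrix},
\]

so that

\[
(A^{-1})' = \begin{bmatrix} -2 & 1 \\ 3/2 & -1/2 \end{bmatrix}.
\]

However,

\[
A' = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix},
\]

so that

\[
(A')^{-1} = \begin{bmatrix} -2 & 1 \\ 3/2 & -1/2 \end{bmatrix},
\]

confirming that $(A^{-1})' = (A')^{-1}$.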


9.6 Matrix functions of a single variable


All the matrix results that have been obtained so far are equally valid whether
applied to matrices whose elements are numerical constants, or to matrices
whose elements are functions of a single variable t. When the latter is the
case it is convenient to copy the notation for a function used hitherto, and
to represent the matrix by writing A(t). In many respects it is convenient to
regard all matrices in this manner, since matrices with constant number
elements correspond to the subset of all possible matrices A(t) in which all
elements are constant functions.

When the elements of A(t) are all differentiable with respect to t in some
interval, it is reasonable to define a derivative of A(t) with respect to t, and
for this purpose we shall work with the following definition.


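DEFINITION 9-15 (derivative of a matrix) Let A(t) be a matrix of order
(m x n) whose elements $a_{ij}(t)$ are all differentiable functions of t in some
common interval $t_0 < t < t_1$. Then the derivative of A(t) with respect to t in
$t_0 < t < t_1$, written dA/dt, is defined to be the matrix of order (m x n) with
elements $da_{ij}/dt$. The matrix A(t) will be said to be differentiable in
$t_0 < t < t_1$. Symbolically this result becomes:

\[
\frac{d}{dt}
\begin{bmatrix}
a_{11}(t) & a_{12}(t) & \cdots & a_{1n}(t) \\
a_{21}(t) & a_{22}(t) & \cdots & a_{2n}(t) \\
\vdots & \vdots & & \vdots \\
a_{m1}(t) & a_{m2}(t) & \cdots & a_{mn}(t)
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{da_{11}}{dt} & \dfrac{da_{12}}{dt} & \cdots & \dfrac{da_{1n}}{dt} \\
\dfrac{da_{21}}{dt} & \dfrac{da_{22}}{dt} & \cdots & \dfrac{da_{2n}}{dt} \\
\vdots & \vdots & & \vdots \\
\dfrac{da_{m1}}{dt} & \dfrac{da_{m2}}{dt} & \cdots & \dfrac{da_{mn}}{dt}
\end{bmatrix}
\]

for $t_0 < t < t_1$.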


Example 9-17 Find dA/dt given that:

\[
A(t) = \begin{bmatrix} \cosh t & \sin t & \cosh 2t \\ \sinh t & \cos t & \sinh 2t \end{bmatrix}.
\]

Solution From Definition 9-15 we have at once:

\[
\frac{dA}{dt} = \begin{bmatrix} \sinh t & \cos t & 2\sinh 2t \\ \cosh t & -\sin t & 2\cosh 2t \end{bmatrix}
\]

for all t.


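If $a_{ij}(t)$ and $b_{ij}(t)$ are differentiable functions in some common
interval $t_0 < t < t_1$, then we know from the work of Chapter 5 that

\[
\frac{d}{dt}(a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj})
= \left(\frac{da_{i1}}{dt}b_{1j} + \frac{da_{i2}}{dt}b_{2j} + \cdots + \frac{da_{in}}{dt}b_{nj}\right)
+ \left(a_{i1}\frac{db_{1j}}{dt} + a_{i2}\frac{db_{2j}}{dt} + \cdots + a_{in}\frac{db_{nj}}{dt}\right).
\]

Consequently, it then follows directly from Definitions 9-3 to 9-6 that for
suitably conformable matrices A and B:

\[
\frac{d}{dt}(A \pm B) = \frac{dA}{dt} \pm \frac{dB}{dt}, \tag{9-28}
\]

\[
\frac{d}{dt}(\lambda A) = \lambda\frac{dA}{dt}, \quad \text{for any constant scalar } \lambda; \tag{9-29}
\]

and

\[
\frac{d}{dt}(AB) = \frac{dA}{dt}B + A\frac{dB}{dt}. \tag{9-30}
\]

Notice that in general $dA^2/dt \neq 2A(dA/dt)$, for setting B = A in Eqn (9-30)
yields

\[
\frac{dA^2}{dt} = \frac{dA}{dt}A + A\frac{dA}{dt}. \tag{9-31}
\]

It also follows that if K is a constant matrix in the sense that its elements are
constant functions of t, then

\[
\frac{dK}{dt} = 0. \tag{9-32}
\]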




Using the results of Theorems 9-3 (d) and 9-8 (b) together with Eqn (9-30),
we can derive two useful results. The first result applies to any two matrices
A, B which are conformable for multiplication and is

\[
\frac{d}{dt}(AB)' = \frac{d}{dt}(B'A') = \frac{dB'}{dt}A' + B'\frac{dA'}{dt}; \tag{9-33}
\]

the second result applies to any two non-singular square matrices which are
conformable for multiplication and is

\[
\frac{d}{dt}(AB)^{-1} = \frac{d}{dt}(B^{-1}A^{-1}) = \frac{dB^{-1}}{dt}A^{-1} + B^{-1}\frac{dA^{-1}}{dt}. \tag{9-34}
\]


We now summarize these results in the form of a general theorem.


THEOREM 9-9 (properties of matrix differentiation) Let A(t) and B(t) be
suitably conformable matrices which are differentiable in some common
interval $t_0 < t < t_1$, and let K be a constant matrix and λ a scalar. Then
in the interval $t_0 < t < t_1$:

(a) $\dfrac{d}{dt}(A + B) = \dfrac{dA}{dt} + \dfrac{dB}{dt}$;

(b) $\dfrac{d}{dt}(A - B) = \dfrac{dA}{dt} - \dfrac{dB}{dt}$;

(c) $\dfrac{d}{dt}(\lambda A) = \lambda\dfrac{dA}{dt}$ (λ a constant scalar);

(d) $\dfrac{d}{dt}(AB) = \dfrac{dA}{dt}B + A\dfrac{dB}{dt}$;

(e) $\dfrac{dK}{dt} = 0$ (K a constant matrix);

(f) $\dfrac{d}{dt}(AB)' = \dfrac{dB'}{dt}A' + B'\dfrac{dA'}{dt}$;

(g) $\dfrac{d}{dt}(AB)^{-1} = \dfrac{dB^{-1}}{dt}A^{-1} + B^{-1}\dfrac{dA^{-1}}{dt}$,

where A and B are non-singular matrices.


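Example 9-18 Verify Eqn (9-33) for the matrices

\[
A(t) = \begin{bmatrix} t & 1 \\ -1 & t^2 \end{bmatrix}
\quad \text{and} \quad
B(t) = \begin{bmatrix} 2 & t^2 \\ t^3 & 1 \end{bmatrix}.
\]

Solution We have

\[
AB = \begin{bmatrix} 2t + t^3 & 1 + t^3 \\ t^5 - 2 & 0 \end{bmatrix},
\]

so that

\[
(AB)' = \begin{bmatrix} 2t + t^3 & t^5 - 2 \\ 1 + t^3 & 0 \end{bmatrix},
\]

and thus

\[
\frac{d}{dt}(AB)' = \begin{bmatrix} 2 + 3t^2 & 5t^4 \\ 3t^2 & 0 \end{bmatrix}.
\]

Now,

\[
A'(t) = \begin{bmatrix} t & -1 \\ 1 & t^2 \end{bmatrix}
\quad \text{and} \quad
B'(t) = \begin{bmatrix} 2 & t^3 \\ t^2 & 1 \end{bmatrix},
\]

so that

\[
\frac{dA'}{dt} = \begin{bmatrix} 1 & 0 \\ 0 & 2t \end{bmatrix}
\quad \text{and} \quad
\frac{dB'}{dt} = \begin{bmatrix} 0 & 3t^2 \\ 2t & 0 \end{bmatrix}.
\]

Using these results we have

\[
\frac{dB'}{dt}A' + B'\frac{dA'}{dt}
= \begin{bmatrix} 0 & 3t^2 \\ 2t & 0 \end{bmatrix}\begin{bmatrix} t & -1 \\ 1 & t^2 \end{bmatrix}
+ \begin{bmatrix} 2 & t^3 \\ t^2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 2t \end{bmatrix}
\]
\[
= \begin{bmatrix} 3t^2 & 3t^4 \\ 2t^2 & -2t \end{bmatrix}
+ \begin{bmatrix} 2 & 2t^4 \\ t^2 & 2t \end{bmatrix}
= \begin{bmatrix} 2 + 3t^2 & 5t^4 \\ 3t^2 & 0 \end{bmatrix}
= \frac{d}{dt}(AB)'.
\]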
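
Eqn (9-33) can also be checked numerically for the matrices of Example 9-18.
The sketch below approximates each derivative by a central difference at the
arbitrarily chosen point t = 1.5, so agreement is only to within rounding and
truncation error; the helper names and the step size h are assumptions made
for this illustration.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def derivative(F, t, h=1e-6):
    """Central-difference approximation to dF/dt, element by element."""
    Fp, Fm = F(t + h), F(t - h)
    return [[(p - m) / (2 * h) for p, m in zip(rp, rm)]
            for rp, rm in zip(Fp, Fm)]

def A(t):
    return [[t, 1], [-1, t ** 2]]

def B(t):
    return [[2, t ** 2], [t ** 3, 1]]

t = 1.5
At = lambda s: transpose(A(s))  # A'(t)
Bt = lambda s: transpose(B(s))  # B'(t)

lhs = derivative(lambda s: transpose(matmul(A(s), B(s))), t)
rhs = add(matmul(derivative(Bt, t), At(t)), matmul(Bt(t), derivative(At, t)))
exact = [[2 + 3 * t ** 2, 5 * t ** 4], [3 * t ** 2, 0]]

# Largest discrepancy between the two sides and the closed-form result.
err = max(max(abs(l - e), abs(r - e))
          for rl, rr, re in zip(lhs, rhs, exact)
          for l, r, e in zip(rl, rr, re))
print(err < 1e-5)  # True
```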


9.7 Solution of systems of linear equations


A system of m linear inhomogeneous equations in the n variables
$x_1, x_2, \ldots, x_n$ has the general form

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= k_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= k_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= k_m,
\end{aligned}
\tag{9-35}
\]

where the term inhomogeneous refers to the fact that not all of the numbers
$k_1, k_2, \ldots, k_m$ are zero. Defining the matrices

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}, \quad
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad
K = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_m \end{bmatrix},
\]

this system can be written

\[
AX = K. \tag{9-36}
\]

Here A is called the coefficient matrix, X the solution vector, and K the
inhomogeneous vector.

In the event that m = n and | A | ≠ 0 it follows that $A^{-1}$ exists, so that
pre-multiplication of Eqn (9-36) by $A^{-1}$ gives for the solution vector,

\[
X = A^{-1}K. \tag{9-37}
\]

This method of solution is of more theoretical than practical interest because
the task of computing $A^{-1}$ becomes prohibitive when n is much greater than
three. However, one useful method of solution for small systems of such
equations (n < 4) known as Cramer's rule may be deduced from Eqn (9-37).

Consideration of Eqn (9-37) and Definition 9-14 shows that $x_i$, the ith
element in the solution vector X, is given by

\[
x_i = \frac{1}{| A |}(k_1A_{1i} + k_2A_{2i} + \cdots + k_nA_{ni}) \tag{9-38}
\]

for i = 1, 2, ..., n, where $A_{ij}$ is the cofactor of A corresponding to
element $a_{ij}$. Using Laplace's expansion theorem we then see that the
numerator of Eqn (9-38) is simply the expansion of $| A_i |$, where $A_i$ denotes
the matrix derived from A by replacing the ith column of A by the column
vector K. Thus we have derived the simple result

\[
x_i = \frac{| A_i |}{| A |} \quad \text{for } i = 1, 2, \ldots, n, \tag{9-39}
\]

which expresses the elements of the solution vector X of Eqn (9-35) in terms
of determinants.

Rule 3 (Cramer's rule)
To solve n linear inhomogeneous equations in n variables proceed as follows:


(a) Compute | A | the determinant of the coefficient matrix and, if
| A | ≠ 0, proceed to the next step;

(b) Compute the modified coefficient determinants $| A_i |$, i = 1, 2, ..., n,
where $A_i$ is derived from A by replacing the ith column of A by the
inhomogeneous vector K;

(c) Then the solutions $x_1, x_2, \ldots, x_n$ are given by

\[
x_i = \frac{| A_i |}{| A |} \quad \text{for } i = 1, 2, \ldots, n;
\]

(d) If | A | = 0 the method fails.


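Example 9-19 Use Cramer's rule to solve the equations:

\[
\begin{aligned}
x_1 + 3x_2 + x_3 &= 8 \\
2x_1 + x_2 + 3x_3 &= 7 \\
x_1 + x_2 - x_3 &= 2.
\end{aligned}
\]

Solution The coefficient matrix A and the modified coefficient matrices
$A_1$, $A_2$, and $A_3$ are obviously:

\[
A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 1 & 3 \\ 1 & 1 & -1 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 8 & 3 & 1 \\ 7 & 1 & 3 \\ 2 & 1 & -1 \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 1 & 8 & 1 \\ 2 & 7 & 3 \\ 1 & 2 & -1 \end{bmatrix}, \quad \text{and} \quad
A_3 = \begin{bmatrix} 1 & 3 & 8 \\ 2 & 1 & 7 \\ 1 & 1 & 2 \end{bmatrix}.
\]

Hence $| A | = 12$, $| A_1 | = 12$, $| A_2 | = 24$, and $| A_3 | = 12$, so that

\[
x_1 = \frac{| A_1 |}{| A |} = 1, \quad
x_2 = \frac{| A_2 |}{| A |} = 2, \quad
x_3 = \frac{| A_3 |}{| A |} = 1.
\]
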
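
Rule 3 translates directly into code. The following is a minimal sketch,
assuming a hypothetical function `cramer` that takes the coefficient matrix
and the inhomogeneous vector as Python lists; it is applied here to the
equations of Example 9-19.

```python
from fractions import Fraction

def minor_matrix(m, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by recursive expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor_matrix(m, 0, j))
               for j in range(len(m)))

def cramer(A, K):
    d = det(A)                      # step (a)
    if d == 0:
        raise ValueError("|A| = 0, so Cramer's rule fails")  # step (d)
    solution = []
    for i in range(len(A)):
        # Step (b): replace column i of A by the inhomogeneous vector K.
        Ai = [row[:i] + [K[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(Ai), d))  # step (c)
    return solution

A = [[1, 3, 1], [2, 1, 3], [1, 1, -1]]
K = [8, 7, 2]
print(*cramer(A, K))  # 1 2 1
```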
In the more general case in which m = n, but | A | = 0, the inverse
matrix does not exist and so any method using $A^{-1}$ must fail. In these
circumstances we must consider more carefully what is meant by a solution. In
general, when a solution vector X exists whose elements simultaneously
satisfy all the equations in the system, the equations will be said to be
consistent. If no solution vector exists having this property then the equations
will be said to be inconsistent. Consider the following equations:

\[
\begin{aligned}
x_1 + x_2 + 2x_3 &= 9 \\
4x_1 - 2x_2 + x_3 &= 4 \\
5x_1 - x_2 + 3x_3 &= 1.
\end{aligned}
\]

These equations are obviously inconsistent, because the left-hand side of the
third equation is just the sum of the left-hand sides of the first two equations,
whereas the right-hand sides are not so related (that is, 1 ≠ 9 + 4). In effect,
what we are saying is that there is a linear dependence between the rows of
the left-hand side of the equations which is not shared by the inhomogeneous
terms. The row linear dependence in the coefficient matrix A is obviously
dependent upon the rank of A and we now offer a brief discussion of one way
in which the general problem of consistency may be approached.


Obviously, when working conventionally with the individual equations
comprising (9-35) we know that: (a) equations may be scaled, (b) equations
may be interchanged, and (c) multiples of one equation may be added to
another. This implies that if we consider the coefficient matrix A of the system 
and supplement it on the right by the elements of the inhomogeneous vector 
K to form what is called the augmented matrix, then these same operations 
are valid for the rows of the augmented matrix. Clearly, the rank will not be 
affected by these operations. If the ranks of A and of the augmented matrix 
denoted by (A, K) are the same, then the equations must be consistent; 
otherwise they must be inconsistent. 


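DEFINITION 9-16 (augmented matrix and elementary row operations)
Suppose that AX = K, where

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}, \quad
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad
K = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix}.
\]

Then the augmented matrix, written (A, K), is defined to be the matrix

\[
(A, K) = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & k_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & k_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & k_n
\end{bmatrix}.
\]

An elementary row operation performed on an augmented matrix is any
one of the following:


(a) scaling of all elements in a row by a non-zero factor λ;
(b) interchange of any two rows;
(c) addition of a multiple of one row to another row.


An augmented matrix will be said to have been reduced to echelon form by
elementary row operations when the first non-zero element in any row is a
unity, and it lies to the right of the unity in the row above.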
Example 9-20 Perform elementary row operations on the augmented
matrix corresponding to the inconsistent equations above to reduce them to
echelon form. Find the ranks of A and (A, K).


Solution The augmented matrix

\[
(A, K) = \begin{bmatrix} 1 & 1 & 2 & 9 \\ 4 & -2 & 1 & 4 \\ 5 & -1 & 3 & 1 \end{bmatrix}.
\]

Subtract from the elements of row 3 the sum of the corresponding elements
in rows 1 and 2 to obtain

\[
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 4 & -2 & 1 & 4 \\ 0 & 0 & 0 & -12 \end{bmatrix}.
\]

Subtract from the elements of row 2, four times the corresponding elements
in row 1 to obtain

\[
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & -6 & -7 & -32 \\ 0 & 0 & 0 & -12 \end{bmatrix}.
\]

Divide row 2 by -6 and row 3 by -12 to obtain

\[
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 1 & 7/6 & 16/3 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]

This is now in echelon form and the rank of the matrix comprising the first
three columns is 2, which must be the same as the rank of the coefficient
matrix A. The rank of (A, K) must be the same as the rank of the echelon
equivalent of the augmented matrix which is clearly 3.
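
The reduction to echelon form and the comparison of ranks can be sketched as
follows. The routine below is one possible ordering of the elementary row
operations of Definition 9-16, not necessarily the ordering used in the
worked solution; the names `echelon` and `rank` are hypothetical, and exact
`Fraction` arithmetic is assumed.

```python
from fractions import Fraction

def echelon(m):
    """Reduce m to echelon form using elementary row operations."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0  # index of the next pivot row
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]   # (b) interchange
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]         # (a) scale to a unity
        for i in range(r + 1, len(rows)):
            f = rows[i][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]  # (c)
        r += 1
        if r == len(rows):
            break
    return rows

def rank(m):
    """Number of non-zero rows in the echelon form."""
    return sum(any(x != 0 for x in row) for row in echelon(m))

A = [[1, 1, 2], [4, -2, 1], [5, -1, 3]]
K = [9, 4, 1]
aug = [row + [k] for row, k in zip(A, K)]

for row in echelon(aug):
    print(*row)
# 1 1 2 9
# 0 1 7/6 16/3
# 0 0 0 1
print(rank(A), rank(aug))  # 2 3
```

Since the two ranks differ, the equations are inconsistent, in agreement with
the example.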

The general conclusion that may be reached from the echelon form of an
augmented matrix (A, K), is that equations are consistent only when the ranks
of A and (A, K) are the same. If the equations are consistent, and A is of
order (n x n) and the rank r < n, we shall have fewer equations than
variables. In these circumstances we may solve for any r of the variables
$x_i$ in terms of the n - r remaining ones which can then be assigned arbitrary
values.


THEOREM 9-10 (solution of inhomogeneous systems) The inhomogeneous
system of equations

\[
AX = K,
\]

where A is of order (n x n) and X, K are of order (n x 1) has a unique
solution if | A | ≠ 0. If | A | = 0, then the equations are only consistent when
the ranks of A and (A, K) are equal. In this case, if the rank r < n, it is
possible to solve for r variables in terms of the n - r remaining variables
which may then be assigned arbitrary values.


Example 9-21 Solve the following equations by reducing the augmented
matrix to echelon form:

\[
\begin{aligned}
x_1 + 3x_2 - x_3 &= 6 \\
8x_1 + 9x_2 + 4x_3 &= 21 \\
2x_1 + x_2 + 2x_3 &= 3.
\end{aligned}
\]

Solution The augmented matrix

\[
(A, K) = \begin{bmatrix} 1 & 3 & -1 & 6 \\ 8 & 9 & 4 & 21 \\ 2 & 1 & 2 & 3 \end{bmatrix}.
\]

Subtract from the elements of row 2 the sum of three times the
corresponding element in row 3 and twice the corresponding element in row 1 to
obtain

\[
\begin{bmatrix} 1 & 3 & -1 & 6 \\ 0 & 0 & 0 & 0 \\ 2 & 1 & 2 & 3 \end{bmatrix}.
\]

Interchange rows two and three to obtain

\[
\begin{bmatrix} 1 & 3 & -1 & 6 \\ 2 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

Subtract twice row 1 from row 2 and divide the resulting row 2 by -5 to
obtain

\[
\begin{bmatrix} 1 & 3 & -1 & 6 \\ 0 & 1 & -4/5 & 9/5 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

This is now in echelon form and clearly the ranks of A and (A, K) are both 2
showing that the equations are consistent. However, only two equations
exist between the three variables $x_1$, $x_2$, and $x_3$, for the echelon form
of the augmented matrix may be seen to be equivalent to the two scalar equations

\[
x_1 + 3x_2 - x_3 = 6 \quad \text{and} \quad x_2 - \tfrac{4}{5}x_3 = \tfrac{9}{5}.
\]

Hence, assigning $x_3$ arbitrarily, we find that

\[
x_1 = \tfrac{3}{5} - \tfrac{7}{5}x_3 \quad \text{and} \quad x_2 = \tfrac{9}{5} + \tfrac{4}{5}x_3.
\]

When the inhomogeneous vector K = 0, the resulting system of equations
AX = 0 is said to be homogeneous. Consider the case of a homogeneous
system of n equations involving the n variables $x_1, x_2, \ldots, x_n$. Then it
is obvious that a trivial solution $x_1 = x_2 = \cdots = x_n = 0$ corresponding to
X = 0 always exists, but a non-trivial solution, in the sense that not all
$x_1, x_2, \ldots, x_n$ are zero, can only occur if | A | = 0. To see this notice
that if | A | ≠ 0 then $A^{-1}$ exists, so that premultiplication of AX = 0 by
$A^{-1}$ gives at once the trivial solution X = 0 as being the only possible
solution. Conversely, if | A | = 0, then certainly at least one row of A is
linearly dependent upon the other rows, so that there are fewer than n
independent equations and solutions exist in which not all of the variables
$x_1, x_2, \ldots, x_n$ are zero.

When a non-trivial solution exists to a homogeneous system of n
equations involving n variables it cannot be unique, for if X is a solution
vector, then so also is λX, where λ is a scalar. As in our previous discussion,
if the rank of A which is of order (n x n) is r, then we may solve for r of the
variables $x_1, x_2, \ldots, x_n$ in terms of the n - r remaining ones which can
then be assigned arbitrary values.


THEOREM 9-11 (solution of homogeneous systems) The homogeneous
system of equations

\[
AX = 0,
\]

where A is of order (n x n) and X, 0 are of order (n x 1) always has the
trivial solution X = 0. It has a non-trivial solution only when | A | = 0. If
A is of rank r < n, it is possible to solve for r variables in terms of the n - r
remaining variables which may then be assigned arbitrary values. If X is a
non-trivial solution, so also is λX, where λ is an arbitrary scalar.


Example 9-22 Solve the equations

\[
\begin{aligned}
x_1 - x_2 + x_3 &= 0 \\
2x_1 + x_2 - x_3 &= 0 \\
x_1 + 5x_2 - 5x_3 &= 0.
\end{aligned}
\]

Solution There is the trivial solution $x_1 = x_2 = x_3 = 0$ and, since the
determinant associated with the coefficient matrix vanishes, there are also
non-trivial solutions. The augmented matrix is now

\[
(A, 0) = \begin{bmatrix} 1 & -1 & 1 & 0 \\ 2 & 1 & -1 & 0 \\ 1 & 5 & -5 & 0 \end{bmatrix},
\]

which is easily reduced by elementary row transformations to the echelon
form

\[
\begin{bmatrix} 1 & -1 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

This shows that there are only two equations between the three variables
$x_1$, $x_2$, and $x_3$, for the echelon form of the augmented matrix is seen to be
equivalent to the two scalar equations

\[
x_1 - x_2 + x_3 = 0 \quad \text{and} \quad x_2 - x_3 = 0.
\]

Hence, assigning $x_3$ arbitrarily, we have for our solution $x_1 = 0$ and
$x_2 = x_3 = k$ (say).

A practical numerical method of solution called Gaussian elimination is
usually used when dealing with inhomogeneous systems of n equations
involving n variables. This is essentially the same method as the one described
above for the reduction of an augmented matrix to echelon form. The only
difference is that it is not necessary to make the first non-zero element
appearing in any row in the position corresponding to the leading diagonal
equal to unity. We illustrate the method by example.


Example 9.23 Solve the following equations by Gaussian elimination:

x1 - x2 - x3 = 0
3x1 + x2 + 2x3 = 6
2x1 + 2x2 + x3 = 2.


Solution The augmented matrix is


         [ 1  -1  -1   0 ]
(A, K) = [ 3   1   2   6 ].
         [ 2   2   1   2 ]


Subtracting three times row 1 from row 2 and twice row 1 from row 3 gives

[ 1  -1  -1   0 ]
[ 0   4   5   6 ].
[ 0   4   3   2 ]

Subtraction of row 2 from row 3 gives

[ 1  -1  -1   0 ]
[ 0   4   5   6 ].
[ 0   0  -2  -4 ]


The solution is now found by the process of ‘back-substitution’ using the 
scalar equations corresponding to this modified augmented matrix. That is, 
the equations 


x1 - x2 - x3 = 0
4x2 + 5x3 = 6
-2x3 = -4.

The last equation gives x3 = 2 and, using this result in the second, then gives
x2 = -1. Combination of these results in the first equation then gives
x1 = 1.
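
The elimination and back-substitution steps of Example 9.23 may be sketched as follows. The function name gauss_solve is hypothetical, exact fractions are used to avoid rounding, and a row interchange is included in case a zero appears on the leading diagonal. The sketch assumes that the coefficient matrix is non-singular.

```python
from fractions import Fraction

def gauss_solve(a, k):
    """Solve a x = k for a non-singular square matrix a.

    Forward elimination without normalising the pivots, followed by
    back-substitution (illustrative sketch only).
    """
    n = len(a)
    # augmented matrix (a, k) in exact rational arithmetic
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, k)]
    for col in range(n):
        # bring a row with a non-zero pivot into position
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # back-substitution, starting from the last equation
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

solution = gauss_solve([[1, -1, -1], [3, 1, 2], [2, 2, 1]], [0, 6, 2])
print(solution)
```

The values returned are x1 = 1, x2 = -1, and x3 = 2, as found above.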


It is not proposed to offer more than a few general remarks about the
solutions of m equations involving n variables. If the equations are
consistent, but there are more equations than variables so that m > n, it is clear
that there must be linear dependence between the equations. In the case that
the rank of the coefficient matrix is equal to n there will obviously be a
unique solution for, despite appearances, there will be only n linearly
independent equations involving n variables. If, however, the rank r is less than n
we are in the situation of solving for r variables x1, x2, ..., xr in terms of the
remaining n - r variables whose values may be assigned arbitrarily. In the
remaining case where there are fewer equations than variables we have
m < n. When this system is consistent it follows that at least n - m variables
must be assigned arbitrary values.


9.8 Eigenvalues and eigenvectors 
Let us examine the consequence of requiring that in the system

AX = K, (9.40)


where A is of order (n × n) and X, K are of order (n × 1), the vector K is
proportional to the vector X itself. That is, we are requiring that K = λX,
where λ is some scalar multiplier as yet unknown. This requires us to solve
the system


AX = λX, (9.41)

which is equivalent to the homogeneous system

(A - λI)X = 0, (9.42)

where I is the unit matrix.

Now we know from Theorem 9.11 that Eqn (9.42) can only have a non-trivial
solution when the determinant associated with the coefficient matrix
vanishes, so that we must have


|A - λI| = 0. (9.43)


When expanded, this determinant gives rise to an algebraic equation of
degree n in λ of the form


λ^n + a1 λ^(n-1) + a2 λ^(n-2) + ... + an = 0. (9.44)


The determinant (9.43) is called the characteristic determinant associated with
A and Eqn (9.44) is called the characteristic equation. It has n roots λ1, λ2,
..., λn, each of which is called either an eigenvalue, a characteristic root, or,
in some texts, a latent root of A.


Example 9.24 Find the characteristic equation and the eigenvalues
corresponding to


A = [ 1  2 ]
    [ 3  0 ].


Solution We have


A - λI = [ 1 - λ    2 ]
         [   3     -λ ],


so that

|A - λI| = | 1 - λ    2 | = λ² - λ - 6.
           |   3     -λ |

Thus the characteristic equation is

λ² - λ - 6 = 0,

and its roots, the eigenvalues of A, are λ = 3 and λ = -2.
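
For a matrix of order (2 × 2) the characteristic equation is the quadratic λ² - (a11 + a22)λ + (a11 a22 - a12 a21) = 0, and its roots are easily computed. The short sketch below does this for the matrix with rows (1, 2) and (3, 0), which is the matrix implied by the characteristic determinant in the solution above. The function name eigenvalues_2x2 is our own, and complex roots are deliberately not handled.

```python
import math

def eigenvalues_2x2(a):
    """Real roots of |A - λI| = λ² - (a11 + a22)λ + (a11·a22 - a12·a21) = 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

A = [[1, 2], [3, 0]]
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)
```

Here the trace is 1 and the determinant is -6, so the quadratic is λ² - λ - 6 = 0, as in the example.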


No consideration will be given here to the interpretation that is to be
placed on the appearance of repeated roots of the characteristic equation,
and henceforth we shall always assume that all the eigenvalues (roots) are
distinct.

Returning to Eqn (9.42) and setting λ = λi, where λi is any one of the
eigenvalues, we can then find a corresponding solution vector Xi which,
because of Theorem 9.11, will only be determined to within an arbitrary
scalar multiplier. This vector Xi is called either an eigenvector, a characteristic
vector, or a latent vector of A corresponding to λi. The eigenvectors of a
square matrix A are of fundamental importance both in the theory of matrices
and in their application, and some indication of this will be given later in
Section 15.8.


Example 9.25 Find the eigenvectors of the matrix A in Example 9.24.


Solution Use the fact that the eigenvalues have been determined as being
λ = 3 and λ = -2 and make the identifications λ1 = 3 and λ2 = -2. Now
let the eigenvectors X1 and X2, corresponding to λ1 and λ2, be denoted by


X1 = [ x1(1) ]   and   X2 = [ x1(2) ]
     [ x2(1) ]              [ x2(2) ].


Then for the case λ = λ1, Eqn (9.42) becomes


[ (1 - 3)     2     ] [ x1(1) ] = 0,
[    3     (0 - 3)  ] [ x2(1) ]


whence

-2x1(1) + 2x2(1) = 0 and 3x1(1) - 3x2(1) = 0.


These are automatically consistent by virtue of their manner of definition,
so that we find that x1(1) = x2(1). So, arbitrarily assigning to x1(1) the value
x1(1) = 1, we find that the eigenvector X1 corresponding to λ1 = 3 is


X1 = [ 1 ]
     [ 1 ].


A similar argument for λ = λ2 gives


[ (1 + 2)     2     ] [ x1(2) ] = 0,
[    3     (0 + 2)  ] [ x2(2) ]


whence

3x1(2) + 2x2(2) = 0.


Again, arbitrarily assigning to x1(2) the value x1(2) = 1, we find that x2(2) =
-3/2. Thus the eigenvector X2 corresponding to λ2 = -2 is


X2 = [   1  ]
     [ -3/2 ].


Obviously μX1 and μX2 are also eigenvectors for any arbitrary scalar μ.
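
The defining property AX = λX of the eigenvectors just found is easily checked by direct multiplication. The sketch below assumes the matrix with rows (1, 2) and (3, 0) from Example 9.24. The helper mat_vec and the sample scalar mu are our own illustrative choices.

```python
from fractions import Fraction

def mat_vec(a, x):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

A = [[1, 2], [3, 0]]
# (eigenvalue, eigenvector) pairs taken from Example 9.25
pairs = [(3, [1, 1]), (-2, [1, Fraction(-3, 2)])]

for lam, x in pairs:
    # A X must equal λ X component by component
    assert mat_vec(A, x) == [lam * xi for xi in x]

# an arbitrary scalar multiple of an eigenvector is again an eigenvector
mu = Fraction(5, 7)
scaled = [mu * xi for xi in pairs[1][1]]
print(mat_vec(A, scaled) == [-2 * xi for xi in scaled])
```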

9.9 Linear transformations


Any introductory account of matrices would be incomplete were the basic
idea of a linear transformation not to be mentioned. Some discussion of this
important concept has already been offered in Section 9.1, and we now
develop the idea a little further. Indeed, to recapitulate briefly, it was
explained there how a linear transformation is just a simple form of mapping of
the points of one plane into the points of another. This idea is still useful
when a matrix vector X of order (n × 1) is mapped by a matrix transformation
into what is called its image X̃ under the transformation. In this context
the elements of X are usually considered to be the components of a vector
in an n-dimensional space, so that X then specifies a point in that space,
and X̃ is its image point under the linear transformation. We propose to
work with the following straightforward definition of such a transformation.


DEFINITION 9.17 (linear transformation) A general linear transformation
or point transformation of the vector X of order (n × 1) into the image X̃ of
order (n × 1) is defined to be a transformation of the form


X̃ = AX + K,


where the coefficient matrix A is of order (n × n) and the vector K is of order
(n × 1).


The special case considered in Section 9.1 involved a mapping of points
of the plane brought about solely by a rotation of the plane through an angle
θ about the origin. In that case the transformation corresponded to K = 0,
and


A = [ cos θ  -sin θ ]
    [ sin θ   cos θ ].   (9.45)


This matrix is called an orthogonal matrix because it has the property that
A′ = A⁻¹, and it is representative of a very important class of square matrices.
The first row of A is seen to contain the direction cosines of Ox′ with respect
to Ox and Oy, whilst the second row contains the direction cosines of Oy′
with respect to Ox and Oy.
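
The orthogonality property A′ = A⁻¹ is equivalent to AA′ = I, and this can be checked numerically for any chosen angle. The following is a minimal sketch in which the helper names and the sample angle 0.7 are our own choices.

```python
import math

def rotation(theta):
    """Plane rotation matrix of the form (9.45)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(a):
    return [list(col) for col in zip(*a)]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = rotation(0.7)
product = mat_mul(A, transpose(A))
# A A′ should be the unit matrix to within rounding error
is_identity = all(
    math.isclose(product[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
print(is_identity)
```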

More generally, consider the rectangular axes O{x1, x2, x3} which are
arbitrarily rotated about the origin O to form the axes system O{x1′, x2′, x3′}, in
which the direction cosine of Oxi′ with respect to Oxj becomes γij. Then the
matrix


    [ γ11  γ12  γ13 ]
A = [ γ21  γ22  γ23 ]   (9.46)
    [ γ31  γ32  γ33 ]


is strictly analogous to matrix (9.45), and it is easily seen that X and X̃ are
related by


X̃ = AX. (9.47)


In the special case that the rotation is only about the x3-axis through an
angle θ in the sense shown in Fig. 9.1, then γ13 = γ31 = γ32 = γ23 = 0 and
γ33 = 1, and


    [ cos θ  -sin θ  0 ]
A = [ sin θ   cos θ  0 ].   (9.48)
    [   0       0    1 ]


When discussing an application of a linear transformation to the theory of
elasticity in the next section we shall have occasion to refer to this matrix
again.


Aside from the rotation transformation characterized by Eqns (9-46) 
and (9-47) there are three other simple transformations worthy of note and 
these are listed below. It is left as an exercise for the reader to verify their 
main properties when related to the plane which give rise to their names. 


1. The identity transformation This is the transformation X̃ = X, and it
corresponds to the case K = 0 and A = I. Under this transformation X and
its image X̃ are coincident.


2. The translation transformation This is the transformation X̃ = X + K,
and it corresponds to an arbitrary non-zero vector K and A = I. The effect
of the transformation is to translate X to its image X̃, without rotation or
change of scale.


3. The dilatation transformation This is a transformation X̃ = AX, in which A
is a non-singular diagonal matrix. Its effect when mapping X into X̃ is to
change the scale of the different elements of X without translation or rotation.
In the special case that all the diagonal elements are equal, say to λ, where
λ > 1, its effect is one of a magnification of X.


Example 9.26 If


X = [ x ],   X̃ = [ x′ ],   and   A = [ 5  0 ]
    [ y ]         [ y′ ]              [ 0  2 ],

deduce the image of the curve y = sinh x under the transformation


X̃ = AX.


Solution We have

[ x′ ] = [ 5  0 ] [ x ]
[ y′ ]   [ 0  2 ] [ y ],

so that

x′ = 5x and y′ = 2y.

Thus the image curve of y = sinh x is given parametrically by

x′ = 5x and y′ = 2 sinh x

or, equivalently, by


y′ = 2 sinh (x′/5).
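
The result of Example 9.26 can be checked by mapping a few points of the curve y = sinh x and confirming that each image point lies on y′ = 2 sinh (x′/5). The sketch below assumes the diagonal matrix with entries 5 and 2 that is implied by the solution. The function name image and the sample points are our own.

```python
import math

A = [[5, 0], [0, 2]]  # dilatation matrix assumed from the solution of Example 9.26

def image(x, y):
    """Image (x′, y′) of the point (x, y) under the transformation with matrix A."""
    return A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y

ok = True
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    xp, yp = image(x, math.sinh(x))
    # the image point should satisfy y′ = 2 sinh(x′/5)
    ok = ok and math.isclose(yp, 2 * math.sinh(xp / 5), rel_tol=1e-12, abs_tol=1e-12)
print(ok)
```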


9.10 Applications of matrices and linear transformations


It is the object of this final section to indicate a few of the diverse applications
of the work of this chapter. Of necessity, we will be able to do no more than
outline this large and fruitful field of study, and for our first example we
look to the notion of rank to enable us to prove an important result in
dimensional analysis known as the Buckingham Pi theorem.


9.10 (a) Application of rank to dimensional analysis: Buckingham Pi
theorem


In many branches of engineering and science, a valuable method of approach 
to difficult problems is via the method of dimensional analysis touched on 
briefly at the start of Chapter 5. In essence, this method seeks first to char- 
acterize a physical situation by forming dimensionless groups from the 
variables involved, and then to determine the functional relationships which 
relate these dimensionless groups. Our contribution will be to the first part 
of this process, for we shall determine how many dimensionless groups exist. 
Let us suppose that a physical situation is described by n variables
u1, u2, ..., un, each of which corresponds to a physical quantity. Suppose
also that each of these quantities is capable of expression dimensionally in
terms of length [L], mass [M], and time [T], and that ui has dimensions


[L]^ai [M]^bi [T]^ci.

Then the product of powers

u1^k1 u2^k2 ... un^kn, (9.49)


where k1, k2, ..., kn are real numbers, must have dimensions


[L]^(a1k1 + a2k2 + ... + ankn) [M]^(b1k1 + b2k2 + ... + bnkn) [T]^(c1k1 + c2k2 + ... + cnkn).


Such products of powers will be dimensionless, in the sense that they are
pure numbers having dimensions

[L]^0 [M]^0 [T]^0,

only if

a1k1 + a2k2 + ... + ankn = 0
b1k1 + b2k2 + ... + bnkn = 0
c1k1 + c2k2 + ... + cnkn = 0


or, equivalently, if


[ a1  a2  ...  an ] [ k1  ]
[ b1  b2  ...  bn ] [ k2  ] = 0.   (9.50)
[ c1  c2  ...  cn ] [ ... ]
                    [ kn  ]

Now if the rank of the coefficient matrix of order (3 × n) in Eqn (9.50)
is r, then we know from the work of Section 9.7 that it is possible to express
r of the variables k1, k2, ..., kn in terms of the remaining n - r variables,
which may be assigned arbitrarily. That is to say, it will be possible to form
n - r dimensionless quantities π1, π2, ..., π(n-r) from the n variables
u1, u2, ..., un. The dimensionless variables πi are called Pi-variables. Hence
we have proved the following result.


THEOREM 9.12 (Buckingham Pi theorem) Let a physical situation be
capable of description in terms of n physical quantities u1, u2, ..., un, where
ui has dimensions [L]^ai [M]^bi [T]^ci. Then, if r is the rank of the matrix


[ a1  a2  ...  an ]
[ b1  b2  ...  bn ],
[ c1  c2  ...  cn ]


the physical situation is capable of description in terms of n - r dimensionless
variables π1, π2, ..., π(n-r) formed from the variables u1, u2, ..., un.

This is best illustrated by example. In the slow viscous flow of a fluid
between parallel planes, some functional relationship of the form


V = f(k, d, η)


exists between the average flow velocity V, the pressure gradient k along the
flow, the distance d between the planes and the viscosity η. The dimensions
of these quantities, which will form the matrix in the Buckingham Pi theorem,
are shown in the table below:

        V    k    d    η
L       1   -2    1   -1
M       0    1    0    1
T      -1   -2    0   -1

The rank of the (3 × 4) matrix whose elements comprise the entries in
this table is 3, as may be seen, for example, by using elementary row
operations to reduce it to its echelon equivalent


[ 1  -2   1  -1 ]
[ 0   1   0   1 ],
[ 0   0   1   2 ]


in which the determinant formed from the first three columns is non-zero.

Thus, from the conditions of the theorem, the number of π variables is
4 - 3 = 1. A dimensionless grouping in this case is kd²/ηV, and any product
of powers of the form shown in (9.49) must be a power of this one dimensionless
group. Hence this physical problem is capable of description in terms of
the one dimensionless grouping π = kd²/ηV. As the velocity profile across
the flow only depends on the distance x from one of the walls, our result
implies that all such flows will be characterized by one curve describing the
variation of π with x/d.
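
The counting argument of the Buckingham Pi theorem lends itself to a short computational check. The sketch below assumes the standard dimensions of the four quantities, namely V: LT⁻¹, k: L⁻²MT⁻², d: L, and η: L⁻¹MT⁻¹. It computes the rank of the resulting (3 × 4) matrix and confirms that the exponents of the grouping kd²/ηV satisfy the homogeneous system (9.50). The function name rank is our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix found by elimination in exact arithmetic (illustrative helper)."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# columns: V, k, d, η; rows: exponents of L, M, T (assumed standard dimensions)
dims = [
    [1, -2, 1, -1],
    [0, 1, 0, 1],
    [-1, -2, 0, -1],
]
n_groups = len(dims[0]) - rank(dims)

# exponents (k1, k2, k3, k4) of V, k, d, η in the grouping k d² / (η V)
exponents = [-1, 1, 2, -1]
dimension_of_group = [sum(a * k for a, k in zip(row, exponents)) for row in dims]
print(n_groups, dimension_of_group)
```

A zero entry for each of L, M, and T shows that the grouping is a pure number, and n - r = 1 shows that it is essentially the only one.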


9.10 (b) Differentials as linear transformations


We now consider a generalization of the total differential as described in
Theorem 5.19 and subsequently used in Theorem 5.22. Let us suppose that


u1 = f1(x1, x2, ..., xn)
u2 = f2(x1, x2, ..., xn)
...                              (9.51)
un = fn(x1, x2, ..., xn);

then it follows from Theorem 5.19 and the properties of matrices that

[ du1 ]   [ ∂f1/∂x1  ∂f1/∂x2  ...  ∂f1/∂xn ] [ dx1 ]
[ du2 ] = [ ∂f2/∂x1  ∂f2/∂x2  ...  ∂f2/∂xn ] [ dx2 ]
[ ... ]   [   ...      ...    ...    ...   ] [ ... ]   (9.52)
[ dun ]   [ ∂fn/∂x1  ∂fn/∂x2  ...  ∂fn/∂xn ] [ dxn ]
This can be written

du = A dx (9.53)


by identifying du, dx with the (n × 1) column vectors in Eqn (9.52) in the
obvious manner, and A with the (n × n) matrix of partial derivatives.
Viewed in this light, Eqn (9.53) may be seen to be a local linear transformation
mapping dx into du. The adjective local is used here because the transformation
will only be a linear transformation when A is a constant matrix, and as
the elements of A are functions of x1, x2, ..., xn, they can only be approximated
by constants in the neighbourhood of any fixed point P with coordinates
{x1^P, x2^P, ..., xn^P}. For different points P, the transformation A
will be different, showing that Eqn (9.53) represents a more general type of
transformation than a general linear point transformation.

Transformation (9.53) will be one-to-one provided that A⁻¹ exists, for
then a unique inverse mapping


dx = A⁻¹ du (9.54)


will exist. The condition for this is, of course, that |A| ≠ 0 at the point P.
This will be recognized as the non-vanishing Jacobian condition already
encountered in Chapter 5.


Fig. 9.2 Spherical polar coordinates.


By way of example, consider the relationship between the spherical polar
coordinates (r, θ, φ) and the Cartesian coordinates (x, y, z) illustrated in
Fig. 9.2 and described by


x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ.


Making the identifications u1 = x, u2 = y, u3 = z, and x1 = r, x2 = θ,
x3 = φ, a simple calculation shows that Eqn (9.52) will take the form


[ dx ]   [ sin θ cos φ   r cos θ cos φ   -r sin θ sin φ ] [ dr ]
[ dy ] = [ sin θ sin φ   r cos θ sin φ    r sin θ cos φ ] [ dθ ].   (9.55)
[ dz ]   [    cos θ        -r sin θ            0        ] [ dφ ]


Denoting the square matrix in Eqn (9.55) by A, it is easily established that the
Jacobian determinant |A| = r² sin θ. Calculating the inverse matrix A⁻¹
and using it to deduce the inverse mapping we have, provided r² sin θ ≠ 0,
that


[ dr ]                    [ r² sin² θ cos φ       r² sin² θ sin φ       r² sin θ cos θ ] [ dx ]
[ dθ ] = (1/(r² sin θ))   [ r sin θ cos θ cos φ   r sin θ cos θ sin φ   -r sin² θ      ] [ dy ].
[ dφ ]                    [ -r sin φ              r cos φ               0              ] [ dz ]


(9.56)
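
Both the Jacobian determinant |A| = r² sin θ and the inverse mapping can be checked numerically at any point for which r² sin θ ≠ 0. The sketch below evaluates the matrix of partial derivatives of x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ together with the inverse stated above, and tests that their product is the unit matrix. The function names and the sample point are our own choices.

```python
import math

def jacobian(r, theta, phi):
    """Matrix of partial derivatives of (x, y, z) with respect to (r, θ, φ)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, r * ct * cp, -r * st * sp],
        [st * sp, r * ct * sp, r * st * cp],
        [ct, -r * st, 0.0],
    ]

def inverse_jacobian(r, theta, phi):
    """Inverse matrix in the form (1/(r² sin θ)) × (matrix), valid when r² sin θ ≠ 0."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    scale = 1.0 / (r * r * st)
    rows = [
        [r * r * st * st * cp, r * r * st * st * sp, r * r * st * ct],
        [r * st * ct * cp, r * st * ct * sp, -r * st * st],
        [-r * sp, r * cp, 0.0],
    ]
    return [[scale * v for v in row] for row in rows]

def det3(m):
    """Determinant of a (3 × 3) matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

r, theta, phi = 2.0, 0.9, 0.4  # arbitrary sample point
A = jacobian(r, theta, phi)
det_ok = math.isclose(det3(A), r * r * math.sin(theta), rel_tol=1e-9)
product = mat_mul(inverse_jacobian(r, theta, phi), A)
inverse_ok = all(
    math.isclose(product[i][j], 1.0 if i == j else 0.0, abs_tol=1e-9)
    for i in range(3) for j in range(3)
)
print(det_ok, inverse_ok)
```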


9.10 (c) Linear transformation of the stress tensor


In the mathematical theory of elasticity it is useful to introduce the concept
of the stress vector associated with any plane element of area within a solid
body. The magnitude of the stress vector is the force per unit area acting on
that plane element of area, and its sense is the sense of the force which is
exerted on that element located at point P, say, by the surrounding material.
In a solid, unlike a liquid, this force depends on the orientation of the element
of area, and it is convenient to describe the situation at point P by considering
elements of plane area normal to each of the unit vectors x1, x2, x3 of a
rectangular Cartesian system O{x1, x2, x3}. If the components in the x1, x2,
and x3 directions of the stress acting on the element of area with xk as its
normal are τk1, τk2, and τk3, then the complete information concerning the
components of stress acting on all three mutually orthogonal elements of
area at P will be contained in the following table:


Stress Components at P


                          1      2      3
Surface normal to x1     τ11    τ12    τ13
Surface normal to x2     τ21    τ22    τ23
Surface normal to x3     τ31    τ32    τ33


In general there will be a different table of this type for each point P in the 
solid. 
The matrix T defined by

    [ τ11  τ12  τ13 ]
T = [ τ21  τ22  τ23 ]
    [ τ31  τ32  τ33 ]


is called the stress tensor at the point P, and it is fundamental to the
development of the mathematical theory of elasticity. Let us now indicate how the
stress tensor transforms when axes centred on P are rotated, since this is a
situation of considerable practical importance, being related to the
determination of the directions of minimum and maximum stress at any point in
a solid body.


Fig. 9.3 Rotation θ about x3-axis.


For this purpose we shall assume that no external moments act on the
body, for then it can be shown that T is symmetric. In addition, we will set
τ13 = τ23 = τ33 = 0, which characterizes what is called a plane state of stress,
since all the forces then lie in the (x1, x2)-plane. The appropriate rotation
matrix A relating the system O{x1, x2, x3} to O{x1′, x2′, x3′} when a rotation
θ about the x3-axis has been made is that given in Eqn (9.48) (see Fig. 9.3).

Hence, setting


F = AT, (9.57)


then the elements of row i of F will contain the components of the transformed
force vector acting on the element of area with Oxi′ as normal. To
relate this result to the stress components τij′ relative to the new axes
O{x1′, x2′, x3′}, we must use the fact that τij′ is equal to the projection of the
force acting on the element of area normal to Oxj′ along Oxi′. To achieve
this result by matrices we must post-multiply A by the transpose F′ of F.
This is so because row i of A contains the direction cosines of axis Oxi′ and
row j of F contains the components of the force acting on the element of
area with Oxj′ as normal, and the rule for matrix multiplication is 'rows into
columns'. Thus if T̃ is the transformed stress tensor, then

T̃ = AF′ = AT′A′, (9.58)

but as T is symmetric, T′ = T, giving

T̃ = ATA′. (9.59)
Using the fact that

    [ τ11  τ12  0 ]
T = [ τ12  τ22  0 ]
    [  0    0   0 ]


when evaluating the indicated matrix products in Eqn (9.59) then shows that
the stress components of the transformed stress tensor are


τ11′ = τ11 cos² θ + τ22 sin² θ - 2τ12 sin θ cos θ
τ22′ = τ11 sin² θ + τ22 cos² θ + 2τ12 sin θ cos θ   (9.60)
τ12′ = (τ11 - τ22) sin θ cos θ + τ12 (cos² θ - sin² θ)

with

τ13′ = τ23′ = τ33′ = 0.

These results form the basis of many important studies involving plane 
stress in solids on which no external moment is acting. 
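
The components (9.60) may be checked against a direct evaluation of the product ATA′, with A the rotation matrix (9.48). The sketch below does this for one set of sample values. The numerical stresses, the angle, and the helper names are our own illustrative choices and carry no physical significance.

```python
import math

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rotate_stress(t, theta):
    """Transformed stress tensor A T A′ for a rotation θ about the x3-axis."""
    c, s = math.cos(theta), math.sin(theta)
    a = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return mat_mul(mat_mul(a, t), transpose(a))

# sample plane state of stress (hypothetical values)
t11, t22, t12, theta = 40.0, -10.0, 15.0, 0.6
T = [[t11, t12, 0.0], [t12, t22, 0.0], [0.0, 0.0, 0.0]]
Tr = rotate_stress(T, theta)

c, s = math.cos(theta), math.sin(theta)
expected_11 = t11 * c * c + t22 * s * s - 2 * t12 * s * c
expected_22 = t11 * s * s + t22 * c * c + 2 * t12 * s * c
expected_12 = (t11 - t22) * s * c + t12 * (c * c - s * s)

matches = (
    math.isclose(Tr[0][0], expected_11, abs_tol=1e-9)
    and math.isclose(Tr[1][1], expected_22, abs_tol=1e-9)
    and math.isclose(Tr[0][1], expected_12, abs_tol=1e-9)
    and math.isclose(Tr[1][0], expected_12, abs_tol=1e-9)
)
print(matches)
```

The transformed tensor remains symmetric, and the sum τ11′ + τ22′ equals τ11 + τ22, as may be seen by adding the first two of Eqns (9.60).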


PROBLEMS 
Section 9-1 


9.1 Suggest two physical situations in which the outcomes may be displayed in
the form of a matrix.


9.2 Find the sum A + B and difference A — B of the matrices 


12 3 4 2 31 2 
A=/}2 12 2 |, ΒΞ 0 2 2 0 
12 0 0 | 1-2 1 1 


9.3 Evaluate the following inner products: 


(a2) [2 11 31]; ®t - 7 ai |; ΟΖ -1 3 η 


“ἡ ὦ WwW N 


Ι 
2 
2 
I 


9-4 Evaluate the following matrix products: 
o3127/ 2 4: Ola το, a
Gay eo es Weel. Ae ede ἃ οἴ ae 
᾿ ο of § 1 1] 


9:5 State which of the following forms of matrix product are defined and, where 
appropriate, give the shape of the resulting product matrix: 


(a) (7 x 3) x 9); (b) (5 x 3)ῶ x 3); 
(c) (1 x 9)9 x 1); (4) G x 1d x 4). 
9-6 If the matrices I, A, and B are given by 
1 0 0 2 1 3 102 1 
I=/0 1 0], A=it 2 1], and B=|]~1 3 1 O}, 
00 |] 5 1 4 2 12 ἢ 
show that


(a) IA = AI = A;
(b) IB = B but that BI is not defined.
9.7 Give an example of matrices A and B for which: 


(a) the product A B is defined but the product B A is not;
(b) the products A B and BA are both defined but are matrices of different 
order; 


(c) the products A B and BA are both defined and are the same order as 
A and B, but they are not equal. 


9.8 Display each of the following sets of simultaneous equations in matrix form: 
(a) 2x +4y+ z=9 | 
x—3y+2z= —4 
χ γ- Ζ2ΖΞ-ΞΕ|Ι, 
(Ὁ); ν  2χ- y=4 
x—3y+2z=-1 
2w + Sx — 32 =0 
4w—- y+4z=2, 
(Ὁ) 3w+ x—2p+4z=1 
w—3x+ y—3z=4 
» - 7x + 2y + 52 = 2, 
(ἃ 2x + γ-- z= dx 
3x + 2y + 42 = dy 
x — 3y + 22 = dz. 
9.9 Let matrices Rθ and Rφ be defined as follows:

Rθ = [ cos θ  -sin θ ]   and   Rφ = [ cos φ  -sin φ ]
     [ sin θ   cos θ ]              [ sin φ   cos φ ],

and let

X = [ x ],   X′ = [ x′ ]   and   X″ = [ x″ ]
    [ y ]         [ y′ ]              [ y″ ].

Then, if


X′ = RθX and X″ = RφX′,
show by matrix multiplication and use of trigonometric identities that
Rφ[RθX] = Rθ+φ X.
Interpret this result geometrically.
Section 9:2 


9.10 Construct the matrices of order (3 x 2) whose general element a;; has the form: 


(a) αι = i? + }3 — 2ῃ); 
(b) aij = sin iθ cos jθ.


9-11 State which of the following pairs of matrices can be made equal by assigning 
suitable values to the constants a, b, and c. Where appropriate, determine what 
these values must be. 


(a) [1 2 10 Ι. 210 
3 a ὃ 2) and |3 1 2 2}, 
12e¢1 124 1 

(Ὁ) [1 5 a 2 151 2 
2 α53 b| and [2 4 3 4], 
4ά 3 2 ς 4321 

(c) 1 (a + δ) 3 1 4 3 
(a + c) 2 4 and [0 2 4). 

1 2 (bh +c) 122 


9:12 Find the numbers a, b, c, and d in order that the following matrix equation 
should be valid: 


2a 1 5 2. 3) 4 6 4 6 
3 2 —b| +/3 ὦ 4, -ἝἩ ΄[ἰ6 —1 —2). 
3c 4 1 3. ,2. 5 3 6 ό 


9.13 Use Definitions 9.3 and 9.4 to prove that if λ, μ are scalars and matrices A
and B are conformable for addition, then


(a) λ(A + B) = λA + λB,
(b) λA + μA = (λ + μ)A.
9:14 Determine 3A + 2B and 2A — 6B given that 


᾿ 3 Ἵ Ξ -1 | 
A= and B= . 
2 —1 6 3 —3 2 
9.15 If
1 141 0 , 
23 4 
A= , B=;2 1 1 Oj, C= ἢ 
15 6 
2. 3. 1 2 
and 
1 
D=)}2], 
3 


find the matrix products A B and CD. 


9.16 This example shows that the matrix product AB = 0 does not necessarily
imply either that A = 0 or that B = 0. If


    [  1  -1   1 ]            [ 1  2  3 ]
A = [ -3   2  -1 ]   and  B = [ 2  4  6 ],
    [ -2   1   0 ]            [ 1  2  3 ]


find A B and B A and show that A B ≠ B A.
9.17 Show that the matrix equation 


AX = K, 
where 
1 3 1 x1 ] 
A=j1 1 2], X=/xel, and K=/]2/], 
2 2 0 x3, 3 
may be solved for x1, x2, and xs by pre-multiplication by B, where 
πὲ 4 ἐ 
Bel ah =F 
0 ἐ -Ὦ 


9.18 Use matrix multiplication to verify the results of Theorem 9-2 when A, B, 
and C are of the form 


1.3 2 —] 2 1 ἘΞ δ "4 
Α-Ξ [’Ι0 1 4, B= 3 —-2 --1},͵, and C= 0 2 4}. 
2. 300d Ϊ 4 2 oe ae 


9.19 If A is a square matrix, then the associative property of matrices allows us to
write A^n without ambiguity because, for example, A³ = A(A A) = (A A)A. If


A = [ cosh x  sinh x ]
    [ sinh x  cosh x ],

use the hyperbolic identities to express A² and A³ in their simplest form and
use induction to deduce the form of A^n.
9.20 Transpose the following matrices:


1 4 19 
τ Hea χε, Δ (c)| 4 0 2 
a : > (ec ᾽ 
(a) | 4301 
19 2 4 
ὃ 3 -2 : 
(4) | —3 0 Thy MOD) | 
2 --,͵ 0 0 


9:21 Use Definition 9-7 and Theorem 9.3 to prove that: 


(a) the sum of a square matrix and its transpose is a symmetric matrix; 
(b) the difference of a square matrix and its transpose is a skew-symmetric 
matrix. 


Illustrate each of these results by an example. 
9.22 Verify that (A B)′ = B′ A′, given that


A = [ 1   4  7 ]   and   B = [ -4  2 ]
    [ 9  -3  1 ]             [  3  1 ].
                             [ -5  6 ]


9.23 If a matrix A contains complex numbers as elements it is said to be a complex 
matrix. Its complex conjugate is denoted by A* and is defined to be the matrix 
obtained from A by replacing each element by its complex conjugate. Show 
from this and the definitions given in the text that: 

(a) (A*)* = A; 

(b) (A + B)* = A* + B*; 

(c) (μA)* = μ̄A*, where μ is any complex number and μ̄ is its complex
conjugate.


9:24 Find the complex conjugates of the matrices A and B, where 


1 εν | i a 
A= and B= 
3—2i i 1+i 1- Δ 


and, taking μ = 1 - i, use them to verify the results of the previous problem.


Section 9.3
9.25 Evaluate the determinants
10 3 1. ἃ 5 \ 
(a) ae ; (0) 1712 0 5; © 3 1 5 |. 
1a : 309 -5 0 -5 
9.26 Without expanding the determinant, prove that

| 1 + a1    a1       a1    |
|   a2    1 + a2     a2    | = (1 + a1 + a2 + a3).
|   a3      a3     1 + a3  |
9.27 Use Theorem 9.4 to simplify the following determinants before expansion:


42 61 50 0 9 3 
(a) |Al/=| 3 O 2]; () JAl=/2 16 4}; 
4 6 5 $1 2 1 
(c) a 
|A;=]5 17 56}. 
4 1 7 
9.28 Without expanding prove that

| x² + a1²    a1a2       a1a3    |
|   a2a1    x² + a2²     a2a3    | = x⁴(x² + a1² + a2² + a3²).
|   a3a1      a3a2     x² + a3²  |
9.29 Show without expansion that


| a²  b²  c² |
| a   b   c  | = (a - b)(a - c)(b - c).
| 1   1   1  |


This determinant is called an alternant determinant. Illustrate the result by 
means of a numerical example and verify it by direct expansion. 


9-30 Prove that 
sin(x +737) sin x cosx 
[A] =| sin@®+42) cosx sinx 
1 | a l—a 
is independent of a, and express it as a function of x. 


9.31 Find the minors Mij and cofactors Aij of each element aij in the matrix


- -# -4 
# αὶ -#1. 
ι πῇ 4 


9.32 If A is an arbitrary matrix of order (3 × 3) with general element aij and
cofactor Aij, show by direct expansion that:
(a) a11A31 + a12A32 + a13A33 = 0;
(b) a13A12 + a23A22 + a33A32 = 0.


9:33 Use the Laplace expansion theorem to expand determinant (b) in Problem 
9-25 first in terms of elements of the third row, and then in terms of elements 
of the third column. 


9-34 If A is a matrix of order (3 x 3) and A’ is its transpose, prove by direct 
expansion that 


|A| = |A′|.
Use the Laplace expansion theorem to prove that this result is true for a
square matrix A of any order.


9-35 Verify by direct expansion that for any square matrices A and B of order 
(2 x 2): 


|AB| = |A| |B|.


This result is, in fact, valid for square matrices A and B of any order. 


Section 9-4 


9-36 Which of the following sets of vectors are linearly independent, and where 
linear dependence exists determine its form: 


         [ 3 ]         [  0 ]         [  0 ]
(a) C1 = [ 0 ],   C2 = [ -7 ],   C3 = [  0 ];
         [ 0 ]         [  0 ]         [ 15 ]

(b) R1 = [1  9  -2  14],   R2 = [-2  -18  4  -28];

         [ 2 ]         [ 1 ]         [ 1 ]         [ 5 ]
(c) C1 = [ 1 ],   C2 = [ 1 ],   C3 = [ 2 ],   C4 = [ 6 ].
         [ 0 ]         [ 7 ]         [ 1 ]         [ 4 ]
9-37 Test the following matrices for linear independence between their rows or 
columns: 
i 2 -1 0 0 2 3 1 L226 315 
23 141 - 0-1 2 21 2 0 
ee a τὰ» Ὁ eee ἃ Sah oe oa 
0 1 2:3 -1 —2 2 90 53.}7Άἴ 7 


9.38 Find the rank of the following matrix: 
2 1 0 4 3 

Ξε] ὦ ἢ. ἃ ey ee 7 
7 -4 —12 14 —-12 


9-39 Construct an example of a matrix of order (4 x 3) which is (a) of rank 2, and 
(b) of rank 3. 


Section 9.5 
9.40 Show that adj A = A when

    [ -4  -3  -3 ]
A = [  1   0   1 ].
    [  4   4   3 ]


9-41 Find the matrix adjoint to each of the following matrices: 
12 3 12 3
a b 
(a) 12 3 21; (b) } 3 41; © | 
cd 
3 3 4 1 4 3 


9-42 Set 


k all: 5} 7 


and equate corresponding elements to determine the inverse of 


Ν᾿ 


9.43 Find the inverse of


    [  3  -2  -1 ]
A = [ -4   1  -1 ].
    [  2   0   1 ]

Verify that:
(a) A⁻¹A = AA⁻¹ = I;
(b) (A⁻¹)⁻¹ = A.
9.44 Given that A and B are

    [ 1  2  1 ]            [ 1  -1  2 ]
A = [ 1  4  2 ]   and  B = [ 0   2  4 ],
    [ 0  3  2 ]            [ 1   0  3 ]


verify that (A B)⁻¹ = B⁻¹ A⁻¹.


Section 9-6 


9.45 Find dA/dt and determine the largest interval about the origin in which it is 
defined, given that 


2t3 tanr cost 
A(t) = : 
3 4—7f2 14+f 


9.46 Given that 
AG) cosh ¢t sinh ἃ BU) t J | 
= an = ὃ 
sinh ¢ cosht 2t (3 
verify results (d), (f), and (g) of Theorem 9-9. 


9.47 Show that for the matrix

A(t) = [ cos t  sin t ]
       [ sin t  cos t ]

it is true that (d/dt)A² = 2A(dA/dt), but that this is not true for the matrix
1 ¢
A(t) = ke ab 


9.48 Show that if A(t) is a non-singular matrix, then


dA⁻¹/dt = -A⁻¹ (dA/dt) A⁻¹.


Verify this result when

A = [ cos t  -sin t ]
    [ sin t   cos t ].
Section 9-7 


9.49 Solve the following equations using Cramer’s rule: 
xi+ ΧΩ χει 7 


2X1 — xXe+2x3 = 8 
3X1 + 2x2 — x3 = 11. 


9.50 Solve the equations of the previous example using the inverse matrix method
and compare the task with the previous method. 


9-51 Solve the following equations using Cramer’s rule: 
x1—xXea+ x3— X41 = 1 
2X1 — Xo + 3x3+ x4 = 2 
X1 + xo + 2x3 + 2x4 = 3 
xitxe+ Xst Xa 3. 
9.52 Write down the augmented matrix corresponding to the equations:
2x1 - x2 + 3x3 = 1
3x1 + 2x2 - x3 = 4
x1 - 4x2 + 7x3 = 3.
Show, by reducing this matrix to its echelon equivalent, that these equations
are inconsistent.
9.53 Write down the augmented matrix corresponding to the equations:
3x1 + 2x2 - x3 = 4
2x1 - 5x2 + 2x3 = 1
5x1 + 16x2 - 7x3 = 10.
Show, by reducing this matrix to its echelon equivalent, that these equations
are consistent and solve them.


9.54 Solve the following equations, in which α is an arbitrary constant, by reducing
the augmented matrix to echelon form:


X1+ axX2- axzs= 1 
αχὶ + ΧΩ 2ax3 = —4 
αχὶ — axe + 4x3 =~ 2, 


Consider the effect of α on the solution.
9.55 Solve the following homogeneous equations, in which α is an arbitrary
constant, by reducing the augmented matrix to echelon form:


αx1 - x2 - x3 = 0
-x1 + αx2 - x3 = 0
-x1 - x2 + αx3 = 0.


Consider the effect of α on the solution.


9.56 Solve the following equations using Gaussian elimination:


1.202x1 - 4.371x2 + 0.651x3 = 19.447
-3.141x1 + 2.243x2 - 1.626x3 = -13.702
0.268x1 - 0.876x2 + 1.341x3 = 6.849.
9.57 Discuss briefly, but do not solve, the following sets of equations: 
(a) χι τ xe=1 (Ὁ) x1 + x2 = 1 
2χι -- xe=5; 2x1 — X2 τῷ 5 
x1 — x2 = 0; 
(cc) x1+ xe=1 (d) x1+x2— x3 =0 
2X1 — Xe= 5 2x1 — χὰ — 5x3 = 0. 


—XxX1 — 2x2 = 0; 


Section 9-8 


9.58 Write down the characteristic equations for the following matrices: 


1 0 2 
@ A= |. df (Ὁ) A=/2 1 1 
a “5 γ᾽’ ΞΞ ὡ 
0 2 1 


9:59 Find the eigenvalues and eigenvectors of 


_ 
A= 
= 0 


9.60 Prove that the eigenvalues of a diagonal matrix of any order are given by the
elements on the leading diagonal. What form do the eigenvectors take?
Section 9-9 


9.61 Verify that the matrix A in Eqn (9.46) is orthogonal, and justify the assertion
that X̃ = AX describes the effect of a general rotation of the rectangular
Cartesian axes O{x1, x2, x3}.


9.62 Justify the name reflection transformation of the plane when applied to a 
transformation of the form 


X = AX, 


where either 
9.63 Show that if


X̃ = AX,
where

A = [ cos θ  -sin θ ]
    [ sin θ   cos θ ],

then
X̃′X̃ = X′X,
where the prime signifies the transpose operation. Interpret this result
geometrically.
9-64 If | 
ee fale 
| . ΓΖ | 2. 2 
x= |"), x=[2]. ana a= Mem οὐ τ 
» j es il 
V2 ν2 
deduce the image of the curve $y = x^2$ under the transformation
$\tilde{X} = AX$.


Is the shape of the curve changed ? 
9-65 If 


x= |”). =|"). and ἜΝ A 
y y 0 3 


deduce the image of the curve $y = x^2 + 2x + 1$ under the transformation
$\tilde{X} = AX$.


Describe the effect of the transformation in geometrical terms. 


Section 9-10 


9.66 How many dimensionless groups of variables (π variables) characterize a
physical situation described by:


(a) the four physical quantities: work ($L^2MT^{-2}$), viscosity ($L^{-1}MT^{-1}$),
pressure ($L^{-1}MT^{-2}$) and mass transfer rate ($MT^{-1}$);

(b) the five physical quantities: length ($L$), viscosity ($L^{-1}MT^{-1}$), velocity
($LT^{-1}$), area ($L^2$) and pressure ($L^{-1}MT^{-2}$).


9-67 Express in matrix form the relationship between the differentials dx, dy and 
du, dv, given that 


u = sinh (x3 + y3), v = cosh (x3 — γ5). 


For what values of x and y does this transformation fail to have an inverse?


9.68 Given that
u=x?+2y+1, v = x8 — 2χγ + γ8 
and 
$$p = \sin(u + v), \quad q = \cos(u - v),$$


display in matrix form the relationship between the differentials dx, dy and
du, dv and between du, dv and dp, dq. Use matrix multiplication to express
directly the relationship between the differentials dx, dy and dp, dq.


9.69 Justify the matrix equations (9.55) and (9.56).


9.70 Verify that the square matrix in (9.55) is an orthogonal matrix.


9.71 Perform the calculations required in (9-59) to give the transformed stress 
tensor components (9-60). 


Functions of a complex 
variable 


10.1 Sequences of complex numbers and limits


When considering a definition of a sequence $\{z_n\}$ of complex numbers, we
should first examine to what extent the work of Chapter 3 on sequences of
real numbers is still relevant to complex sequences.

It will obviously be necessary to formulate new definitions, and this will
be our next task. However, since a sequence $\{u_n\}$ of real numbers is just a
special case of a sequence $\{z_n\}$ of complex numbers, any new definitions must
be compatible with the corresponding situations in Chapter 3 when related to
real sequences. Therefore, the behaviour of sequences of complex numbers
will be directly determined by the behaviour of the sequences of real numbers
that may be formed by considering separately the real and imaginary parts
of $\{z_n\}$. Thus if $z_n = [1 + (1/n)] + i(1/n^2)$ we would need to consider the
two real sequences $\{1 + (1/n)\}$ and $\{1/n^2\}$ associated with $\{z_n\}$.

Here we must note that expressions such as 'monotonic', 'finitely
oscillating', and 'bounded above' cannot be applied to sequences of complex
numbers as they cannot be ordered like the real numbers.


DEFINITION 10.1 (limit of complex sequence) The infinite sequence
$\{z_n\}$ of complex numbers $z_n = x_n + iy_n$ will be said to converge or tend to the
limit $\gamma = \mu + i\nu$ if, and only if, for every $\varepsilon > 0$ there exists a number $N$,
such that for $n > N$,

$$|\gamma - z_n| < \varepsilon.$$

When the sequence $\{z_n\}$ is convergent to $\gamma$ in this sense we shall write

$$\lim_{n \to \infty} z_n = \gamma.$$

This definition is easily seen to reduce to Definition 3.3 when applied to a
sequence of real numbers, for then the complex modulus and the absolute
value become identical in meaning.


The essential difference between Definitions 3.3 and 10-1 is embodied in 
the following theorem. 


THEOREM 10.1 (conditions for convergence) Let $\{z_n\}$ be an infinite sequence
of complex numbers $z_n = x_n + iy_n$. Then necessary and sufficient conditions
for

$$\lim_{n \to \infty} z_n = \gamma,$$

where $\gamma = \mu + i\nu$, are that

$$\lim_{n \to \infty} x_n = \mu \quad \text{and} \quad \lim_{n \to \infty} y_n = \nu.$$

Proof A paraphrase of this theorem would be that if $\{z_n\}$ converges to $\gamma$,
then the sequence of the real parts of $\{z_n\}$ converges to the real part of $\gamma$ and
the sequence of the imaginary parts of $\{z_n\}$ converges to the imaginary part
of $\gamma$. To establish the necessity of the conditions of the theorem suppose that,
for a given positive number $\varepsilon$, $|\gamma - z_n| < \varepsilon$ for $n > N$. Then

$$|\gamma - z_n| = |\mu + i\nu - (x_n + iy_n)| = |(\mu - x_n) + i(\nu - y_n)|,$$

and so by the definition of the modulus of a complex number,

$$|\gamma - z_n| = [(\mu - x_n)^2 + (\nu - y_n)^2]^{1/2}.$$

Neglecting first the non-negative term $(\mu - x_n)^2$, and then the non-negative term
$(\nu - y_n)^2$, shows that

$$|\gamma - z_n| \ge |\nu - y_n| \quad \text{and} \quad |\gamma - z_n| \ge |\mu - x_n|.$$

Hence $|\mu - x_n| < \varepsilon$ and $|\nu - y_n| < \varepsilon$ for $n > N$ showing, by virtue of
Definition 3.3, that

$$\lim_{n \to \infty} x_n = \mu \quad \text{and} \quad \lim_{n \to \infty} y_n = \nu.$$

The sufficiency of these conditions is almost immediate. If

$$\lim_{n \to \infty} x_n = \mu \quad \text{and} \quad \lim_{n \to \infty} y_n = \nu,$$

then for any positive $\varepsilon$ choose $N$ such that $|\mu - x_n| < \varepsilon$ and $|\nu - y_n| < \varepsilon$
for $n > N$. Then, as

$$|\gamma - z_n| = [(\mu - x_n)^2 + (\nu - y_n)^2]^{1/2},$$

it follows that

$$|\gamma - z_n| < \sqrt{2\varepsilon^2} = \varepsilon\sqrt{2}.$$

This establishes our result because $\varepsilon$ was arbitrary and so $|\gamma - z_n|$ can
always be made arbitrarily small by a suitable choice of $\varepsilon$.

The fact that a sequence of real numbers can only have one limit implies
the uniqueness of $\mu$ and $\nu$, and hence the uniqueness of $\gamma$. Consequently we
have arrived at the following result.


Corollary 10.1 If the sequence $\{z_n\}$ of complex numbers is convergent, then
it only has one limit.
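The two inequalities used in the proof of Theorem 10.1 can be checked numerically. The following is only a sketch in Python, using the sequence $z_n = [1 + (1/n)] + i(1/n^2)$ mentioned earlier with the candidate limit $\gamma = 1$; the function names are illustrative assumptions and the computation is an illustration, not a proof.

```python
# Numerical illustration of Theorem 10.1 (a sketch, not a proof).

def z(n: int) -> complex:
    """n-th term of the sequence z_n = (1 + 1/n) + i/n**2."""
    return complex(1 + 1 / n, 1 / n**2)

gamma = complex(1, 0)  # candidate limit: mu = 1, nu = 0

def errors(n: int) -> tuple[float, float, float]:
    """Return (|mu - x_n|, |nu - y_n|, |gamma - z_n|)."""
    zn = z(n)
    return (abs(gamma.real - zn.real),
            abs(gamma.imag - zn.imag),
            abs(gamma - zn))

for n in (1, 10, 100, 1000):
    ex, ey, ez = errors(n)
    # Necessity: each component error is at most the modulus ez.
    # Sufficiency: ez is at most sqrt(2) times the larger component error.
    print(n, ex, ey, ez)
```

Each printed row should show the modulus lying between the larger component error and $\sqrt{2}$ times that error, which is exactly the pair of bounds on which the proof rests.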


Example 10.1 Examine each of the following sequences $\{z_n\}$ of complex
numbers for convergence and, where appropriate, determine the limit.


(a) $z_n = \left(1 + \dfrac{1}{n}\right) + i\sin\dfrac{n\pi}{2}$;


oe ee ) n (ἘΞ a. 


(c) $z_n = n\sin\dfrac{2}{n} + i\sqrt{n}\,[\sqrt{n + 6} - \sqrt{n}]$.


Solutions We shall obtain our results by means of a direct application of
Theorem 10.1.
(a) Making the identifications

$$x_n = 1 + \frac{1}{n}, \quad y_n = \sin\frac{n\pi}{2},$$

we see that $\lim_{n \to \infty} x_n = 1$, whereas the sequence $\{y_n\}$ has no limit since $y_n$
assumes successively only the three values 1, 0, and −1. Hence the sequence
$\{z_n\}$ does not converge and so has no limit.
(b) Making the identifications 


2n + 1 (“—) 
Xn = 2 Yn = 


3n n 
we see that

$$\lim_{n \to \infty} x_n = \tfrac{2}{3} \quad \text{and} \quad \lim_{n \to \infty} y_n = 1.$$

Hence the sequence $\{z_n\}$ converges and

$$\lim_{n \to \infty} z_n = \tfrac{2}{3} + i.$$

As the numbers 2/3 and 1 are not members of their defining sequences $\{x_n\}$
and $\{y_n\}$, the complex limit $\gamma$ is not included as a member of the sequence
$\{z_n\}$.


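(c) Make the identifications

$$x_n = n\sin\frac{2}{n}, \quad y_n = \sqrt{n}\,[\sqrt{n + 6} - \sqrt{n}].$$

Then

$$\lim_{n \to \infty} x_n = 2$$

and

$$\lim_{n \to \infty} y_n = \lim_{n \to \infty} \sqrt{n}\cdot\sqrt{n}\left\{\left(1 + \frac{6}{n}\right)^{1/2} - 1\right\} = \lim_{n \to \infty} n\left[1 + \frac{1}{2}\cdot\frac{6}{n} + O\left(\frac{1}{n^2}\right) - 1\right] = 3.$$

Thus the sequence $\{z_n\}$ converges and

$$\lim_{n \to \infty} z_n = 2 + 3i.$$

For the same reason as in (b) above, the limit 2 + 3i is not a member of the
sequence $\{z_n\}$.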

Arguments essentially similar to those given in Theorem 10-1 establish 
results from complex sequences that are strictly analogous to those of 
Theorem 3-1. We state them below without proof. 


THEOREM 10.2 If it can be shown that $\{w_n\}$ and $\{z_n\}$ are two convergent
sequences of complex numbers with $\lim_{n \to \infty} w_n = \lambda$ and $\lim_{n \to \infty} z_n = \gamma$, then

(a) $w_1 + z_1, w_2 + z_2, w_3 + z_3, \ldots$ is a sequence such that

$$\lim_{n \to \infty} (w_n + z_n) = \lambda + \gamma;$$

(b) $w_1z_1, w_2z_2, w_3z_3, \ldots$ is a sequence such that

$$\lim_{n \to \infty} w_nz_n = \lambda\gamma;$$

(c) provided $\gamma \ne 0$, $w_1/z_1, w_2/z_2, w_3/z_3, \ldots$ is a sequence such that

$$\lim_{n \to \infty} \left(\frac{w_n}{z_n}\right) = \lambda/\gamma.$$


Example 10.2 If $w_n = n(1 + i)/(n + 1)$ and $z_n = (1/n) + [(n^2 + 1)/(2n^2 + 3)]i$,
find (a) $\lim (w_n + z_n)$; (b) $\lim (w_nz_n)$; and (c) $\lim (w_n/z_n)$.


Solution By inspection we have

$$\lim_{n \to \infty} w_n = 1 + i \quad \text{and} \quad \lim_{n \to \infty} z_n = \tfrac{1}{2}i.$$

Hence by Theorem 10.2,

(a) $\lim (w_n + z_n) = (1 + i) + \tfrac{1}{2}i = 1 + \tfrac{3}{2}i$;

(b) $\lim (w_nz_n) = (1 + i)\cdot\tfrac{1}{2}i = \tfrac{1}{2}(i - 1)$;

(c) $\lim \left(\dfrac{w_n}{z_n}\right) = \dfrac{1 + i}{\tfrac{1}{2}i} = 2 - 2i$.
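The three limits of Example 10.2 are easily checked by evaluating the sequences at a large value of n. The short Python sketch below does this; the choice n = 10^6 and the variable names are arbitrary assumptions made purely for illustration.

```python
# Numerical check of Example 10.2 at a single large n (illustrative only).

def w(n: int) -> complex:
    """w_n = n(1 + i)/(n + 1)."""
    return n * (1 + 1j) / (n + 1)

def z(n: int) -> complex:
    """z_n = 1/n + i(n^2 + 1)/(2n^2 + 3)."""
    return 1 / n + 1j * (n**2 + 1) / (2 * n**2 + 3)

n = 10**6
limit_sum = 1 + 1.5j        # (a) lambda + gamma
limit_product = -0.5 + 0.5j  # (b) lambda * gamma = (i - 1)/2
limit_quotient = 2 - 2j      # (c) lambda / gamma

print(abs(w(n) + z(n) - limit_sum))
print(abs(w(n) * z(n) - limit_product))
print(abs(w(n) / z(n) - limit_quotient))
```

All three printed discrepancies should be of the order of 1/n, in keeping with the limits stated in the solution.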


If the terms of a sequence are plotted as points in the complex plane, 
Theorem 10-1 and its Corollary imply that when that sequence is convergent, 
it will have the property that with increasing n its points will cluster ever 
closer to the one point that represents its limit y. In other cases it may happen 
that although a sequence is not convergent, nevertheless when its terms are 
plotted in the complex plane they cluster around two or more distinct points. 
By analogy with sequences of real numbers, these points will be called limit
points of the complex sequence, and for their definition they require the
notion of a neighbourhood of a point.

Accordingly, we shall use the term a neighbourhood of the point $\zeta$ in the
complex plane to mean the interior of any circle centred on $\zeta$. This idea
enables us to define a limit point.


DEFINITION 10.2 (limit point) The point $\zeta$ will be called a limit point of
the sequence $\{z_n\}$ of complex numbers if every neighbourhood of $\zeta$ contains
at least one point of $\{z_n\}$ other than the point $\zeta$ itself.


It is an immediate consequence of this definition that every neighbourhood 
of a limit point ζ of {zn} contains an infinite number of points of {zn}. We 
again emphasize that Theorem 10-1 together with its Corollary imply that a 
convergent sequence {zn} of complex numbers can have only one limit point. 


Example 10-3 Identify the limit points of the sequence {zn} where 
2 =) + i yn (1 += sin] 
Zn = - - - μ᾿ ΞΞ 1: 
“ἢ n 2 
Solution Make the identifications 


is d 1» (1 += sin) 
= —— = (— — Sin — ]- 
ὼ n st 4 n 2 
Then $\{x_n\}$ converges to the limit 2 and thus has one limit point, whilst $\{y_n\}$
does not converge but has the two limit points 1 and −1. Hence the sequence
$\{z_n\}$ has the two limit points 2 + i and 2 − i.


10.2 Curves and regions


The notions of a curve and a region in the real plane may be immediately 
extended to the complex plane. As a closed and not necessarily smooth curve 
is a connected set of points which serves to de-limit two areas of the plane, 
which we shall call the interior and exterior regions relative to that curve, we
ought first to define a curve C in the complex plane. It is frequently convenient
to give a parametric representation by expressing C as the set of points

$$z = x(s) + iy(s) \quad \text{for } a \le s \le b, \qquad (10.1)$$


where x(s) and y(s) are continuous real functions of the parameter s. It 
should be apparent from Section 2:5 and subsequent work that the require- 
ment of continuity for the real functions x(s), y(s) will ensure that C is a 
continuous curve (that is, unbroken), but that it does not necessarily possess 
a tangent at every point. As a simple illustration C might be a rectangle, for 
then tangents would not be defined at the corners though the curve would be 
continuous everywhere. We shall return to these general matters later when 
a continuous function of a complex variable has been defined. For 
conciseness let us henceforth call such curves C continuous curves.

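For a less trivial example, suppose that the curve C in the complex plane
is defined by $z = x(s) + iy(s)$, where

$$x(s) = \sin s \quad \text{for } -\tfrac{1}{2}\pi \le s \le \tfrac{3}{2}\pi,$$

$$y(s) = \begin{cases} \sin^2 s & \text{for } -\tfrac{1}{2}\pi \le s \le \tfrac{1}{2}\pi, \\ 1 & \text{for } \tfrac{1}{2}\pi \le s \le \tfrac{3}{2}\pi. \end{cases}$$

Fig. 10.1 Continuous curve C having no tangent defined at P and Q.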


Then it is readily seen that C is the continuous closed curve comprising the
parabola $y = x^2$ in the interval $-1 \le x \le 1$, together with the points of the
line y = 1 common to that same interval. The curve C is shown in Fig. 10.1
and it is continuous everywhere, though it is not smooth everywhere for no
tangent can be defined at points P and Q. The darkly shaded area in that
Figure comprises points which are interior relative to C and form the interior
region, whilst the lightly shaded area comprises points which are exterior 
relative to C and form the exterior region. When speaking in terms of regions, 
the points comprising the curve C itself are usually called the boundary 
points and they may, or may not, belong to a region. 
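A parametric representation of this kind is easily explored by computation. The Python sketch below assumes the parametrization x(s) = sin s on −π/2 ≤ s ≤ 3π/2, with y(s) = sin² s on the first half of the interval and y(s) = 1 on the second half; the function and variable names are illustrative only.

```python
import math

def curve_point(s: float) -> complex:
    """Point z = x(s) + i y(s) of the closed curve C, -pi/2 <= s <= 3*pi/2."""
    x = math.sin(s)
    # Parabolic arc y = x**2 on the first half, the line y = 1 on the second.
    y = math.sin(s) ** 2 if s <= math.pi / 2 else 1.0
    return complex(x, y)

start = curve_point(-math.pi / 2)    # corner at x = -1, y = 1
corner = curve_point(math.pi / 2)    # corner at x = +1, y = 1
end = curve_point(3 * math.pi / 2)   # back at the starting corner

print(start, corner, end)
```

The coincidence of the first and last points confirms that the curve is closed, and the two pieces of y(s) agree at s = π/2, which is the continuity requirement placed on x(s) and y(s).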

A parametric representation of a curve C is not always the most convenient 
method for its description in the complex plane and, on occasions, it is better 
to identify the points z comprising a curve directly in terms of z itself. When 
necessary, regions are usually defined in the complex plane by means of a 
combination of curves and inequalities, as was done in the real plane. 


Example 10.4 Describe the curve C defined by the equation

$$|z - 2| = 3/2$$

and use the result to define the region exterior to C.


Solution This expression defines a connected set of points that all have a 
modulus 3/2 relative to the point z = 2 as origin, that is to say, the set of 
points which are all distant 3/2 from the point z = 2. Hence the equation 
| z — 2 | = 3/2 describes a circle C of radius 3/2 centred on the point z = 2. 
Algebraically, the same result is obtained by writing z = x + iy, when 
|z—2| =| (x — 2) + iy |, so that from the definition of the modulus of a 
complex number, | z — 2| = 3/2 is seen to be equivalent to the algebraic 
equation (x — 2)? + y? = 9/4. This is a circle of radius 3/2 centred on the 
point (2, 0). The region exterior to C is the entire complex plane less the 
points lying in and on this circle. 
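The description of regions by inequalities on the modulus translates directly into a test on |z − 2|. The following Python sketch classifies a point relative to the circle of Example 10.4; the function name and the sample points are assumptions chosen for illustration, and the exact comparison for the boundary is reliable only for points whose distance is exactly representable in floating point.

```python
def classify(z: complex, centre: complex = 2, radius: float = 1.5) -> str:
    """Classify z relative to the circle |z - centre| = radius."""
    distance = abs(z - centre)
    if distance < radius:
        return "interior"
    if distance == radius:
        return "boundary"
    return "exterior"

for point in (2, 3.5, 2 + 1.5j, 0, 4j):
    print(point, classify(point))
```

The exterior region of the example is then the set of points for which the function returns "exterior", that is, those with |z − 2| > 3/2.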


Example 10.5 Describe the region interior to and including the curve C
defined by

$$\arg(z - 1) - \arg(z - i) = \tfrac{1}{2}\pi,$$
and also satisfying the inequalities 


4+<Rez<? and Imz>0. 


Solution Consider the construction in Fig. 10.2 (a) in which P is the point
z = 1, Q is the point z = i and R is a general point z.

Simple geometrical arguments then establish that the angle γ is related
to the angles α and β by the equation

$$\gamma = \pi + \alpha - \beta.$$

However, the line PR is the vector z − 1, whilst the line QR is the vector
z − i, so that arg (z − i) = α and arg (z − 1) = β. Since by the conditions
of the problem we must have β − α = ½π, it follows that γ = ½π. The angle
QRP is thus a right angle and hence the curve C must be a semi-circle drawn
from P to Q with PQ as its diameter. The semi-circle must lie above the
diameter PQ, since were the general point R to be taken below that line the
equation relating the arguments would no longer be satisfied. To define
the lower semi-circle the following condition would be needed:

$$\arg(z - 1) - \arg(z - i) = -\tfrac{1}{2}\pi.$$

Fig. 10.2 Region in complex plane: (a) boundary curve; (b) region interior to C
and satisfying stated inequalities.

To complete the solution to the problem it is now necessary to interpret 
the inequalities. The inequality 4 < Rez< ? describes the narrow strip 
bounded by the lines x = } and x = 3, with the points of the line x = } 
excluded from consideration. The inequality Im z > 0 is the half plane above 
and including the x-axis itself. Figure 10-2 (b) presents a composite diagram 
with the shaded area representing the region satisfying all the conditions of 
the problem. Boundary points belonging to the region are indicated by a 
heavy line and those excluded by a dotted line. 

Notice from this and the previous example that there is more than one
way of specifying a given curve and region. The condition

$$\arg(z - 1) - \arg(z - i) = \tfrac{1}{2}\pi$$

is an alternative expression of the condition

$$\left|z - \tfrac{1}{2}(1 + i)\right| = \frac{\sqrt{2}}{2} \quad \text{with } \operatorname{Re} z > 0, \ \operatorname{Im} z > 0,$$

which, in turn, is an alternative expression of the algebraic condition

$$x^2 + y^2 = x + y \quad \text{with } x > 0, \ y > 0.$$


10.3 Function of a complex variable, limits, and continuity


In Chapter 2 we used the term ‘a real valued function of a real variable’ to 
mean any rule that associates with each real number from the domain of 
definition of the function a unique real number from the range of that func- 
tion. Symbolically, if D denotes the set of points in the domain of a function 
f, and R denotes the set of points in the range of f, this relationship or mapping 
is given by 


R= f(D). 


These ideas still hold good when the domain D and the range R include 
complex numbers. Thus if z is any point in D, and w is the unique number 
assigned to z by the function f, we write 


$$w = f(z). \qquad (10.2)$$


The number z = x + iy is allowed to assume any value in D and so, if
desired, could be called a complex independent variable, when w could then
properly be called a complex dependent variable. Usually we shall simply 
refer to z and w as complex variables. It must be appreciated that, like z, the 
variable w has a real part and an imaginary part, both of which are in general 
dependent on x and y through the variable z = x + iy. We summarize these 
ideas formally as follows. 


DEFINITION 10.3 (function of a complex variable) We shall say that f is a
function of the complex variable z = x + iy, and write 


w = f(z), 


if f associates a unique complex number w = u + iv with each complex 
number z belonging to some region D of the complex plane. 


Specific examples of functions of a complex variable are:

(a) $w = iz + 1$; (b) $w = z\bar{z}$; (c) $w = z^2 + 2z + 1$; (d) $w = 1/(z - 2)$;

(e) $w = \sin z$.

With the exception of (d), which is not defined for z = 2, these functions are
defined for all z.

The difference between a function of a complex variable and a real valued 
function of a real variable is made clear by expressing these examples in real 
and imaginary form. Thus writing z = x + iy and w = u + iv we find:

(a) $w = i(x + iy) + 1 = (1 - y) + ix$, showing that $u = 1 - y$, $v = x$;

(b) $w = (x + iy)(x - iy) = x^2 + y^2$, showing that $u = x^2 + y^2$, $v = 0$.
This is an example of a function that always maps a complex variable
into a real variable.

(c) $w = (x + iy)^2 + 2(x + iy) + 1 = (x^2 + 2x - y^2 + 1) + i(2y + 2xy)$,
showing that $u = x^2 + 2x - y^2 + 1$, $v = 2y(1 + x)$;

(d) $w = 1/(x + iy - 2) = [(x - 2) - iy]/(x^2 + y^2 - 4x + 4)$, showing
that $u = (x - 2)/(x^2 + y^2 - 4x + 4)$, $v = -y/(x^2 + y^2 - 4x + 4)$,
provided only that x = 2 and y = 0 do not hold simultaneously (that is, z ≠ 2);

(e) $w = \sin z = \sin(x + iy) = \sin x \cos iy + \cos x \sin iy$, and so using
the results of Problem 6.33, that $\cos iy = \cosh y$, $\sin iy = i\sinh y$,
we arrive at $w = \sin x \cosh y + i\cos x \sinh y$. Thus in this case
$u = \sin x \cosh y$, $v = \cos x \sinh y$.


Any function of x, y and complex constants that gives rise to a unique 
complex number when x and y are specified defines a function of the complex 
variable z by virtue of the relationship z = x + iy. For suppose that 


$$(x + y + 1) + i(x - 2y) = f(z),$$


then to determine f(z) when z = 1 + 2i we simply write x + iy = 1 + 2i,
showing that x = 1, y = 2, after which it follows from the form of f(z)
that f(1 + 2i) = 4 − 3i.
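The separation of a complex function into its real and imaginary parts lends itself to a direct numerical check. The Python sketch below evaluates the function f(z) = (x + y + 1) + i(x − 2y) just considered, and compares the parts u = sin x cosh y, v = cos x sinh y found for sin z with the library value of the complex sine; the function names and the test point are illustrative assumptions.

```python
import cmath
import math

def f(z: complex) -> complex:
    """f(z) = (x + y + 1) + i(x - 2y), defined through x = Re z, y = Im z."""
    x, y = z.real, z.imag
    return complex(x + y + 1, x - 2 * y)

def sin_parts(x: float, y: float) -> tuple[float, float]:
    """Real and imaginary parts (u, v) of sin(x + iy)."""
    return math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y)

print(f(1 + 2j))              # value of f at z = 1 + 2i

u, v = sin_parts(0.3, -1.2)
print(complex(u, v), cmath.sin(complex(0.3, -1.2)))
```

The two numbers printed on the last line should agree to rounding error, confirming the decomposition obtained in (e).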

Our Definition 10.1 of a limit of a sequence of complex numbers extends
without difficulty to include the concept of a limit of a function of a complex
variable. In essence, we shall say that f(z) has the limit $w_0$ as $z \to z_0$ and will
write

$$\lim_{z \to z_0} f(z) = w_0 \qquad (10.3)$$

when, for any small $\varepsilon > 0$, we can always ensure that $|f(z) - w_0| < \varepsilon$ by
confining z to some suitably small circular neighbourhood $|z - z_0| < \delta$
of the point $z_0$. That is to say f(z) can be made arbitrarily close to $w_0$ by
taking z sufficiently close to $z_0$, irrespective of the manner of approach of
z to $z_0$. As in the real variable case, we do not require that f(z) be defined at
$z_0$ or, if it is, that $f(z_0)$ should equal $w_0$. Expressed formally this becomes:


DEFINITION 10.4 (limit of a function of a complex variable) The function
f(z) will be said to tend to the limit $w_0$ as $z \to z_0$, and we shall write

$$\lim_{z \to z_0} f(z) = w_0,$$

if, and only if, for any $\varepsilon > 0$ there exists a $\delta > 0$ such that

$$|f(z) - w_0| < \varepsilon \quad \text{when } |z - z_0| < \delta \text{ with } z \ne z_0.$$


This form of statement should be compared with that in Definition 3.8
relating to a real valued function of two real variables. There is no essential
difference, since the complex modulus is equal to the distance function ρ
used in that definition.


Example 10.6 Prove that

$$\lim_{z \to 1+i} (z^2 + 1) = 1 + 2i.$$

Solution The result is self-evident since the function $f(z) = z^2 + 1$ is
uniquely defined for all z, but let us prove it using Definition 10.4. We have

$$|f(z) - (1 + 2i)| = |z^2 - 2i| = |[z - (1 + i)][z + (1 + i)]|,$$

and from the properties of the modulus this becomes

$$\begin{aligned} |f(z) - (1 + 2i)| &= |z - 1 - i|\,|z + 1 + i| \\ &= |z - 1 - i|\,|z - 1 - i + 2(1 + i)| \\ &\le |z - 1 - i|\,\{|z - 1 - i| + 2|1 + i|\}. \end{aligned}$$

Hence we may make $|f(z) - (1 + 2i)| < \varepsilon$, where $\varepsilon > 0$ is arbitrarily
small, provided that we choose the number $\delta > 0$ such that $|z - 1 - i| < \delta$
and $\delta\{\delta + 2|1 + i|\} < \varepsilon$. The conditions of Definition 10.4 are satisfied,
thereby establishing that 1 + 2i is the limit. In other words, as z approaches
the value 1 + i, so the function $f(z) = z^2 + 1$ approaches the number
1 + 2i, which is its limit. In this case it also happens to be true that
$\lim_{z \to z_0} f(z) = f(z_0)$.
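The choice of δ in Example 10.6 can be made explicit and tested. Solving δ{δ + 2|1 + i|} = ε for δ gives δ = −√2 + √(2 + ε), and the Python sketch below samples points on a circle of radius slightly less than this δ about 1 + i to confirm that |f(z) − (1 + 2i)| stays below ε. The sampling scheme, the factor 0.999 and the names are assumptions made for illustration only.

```python
import cmath
import math

Z0 = 1 + 1j   # point of approach
W0 = 1 + 2j   # claimed limit

def f(z: complex) -> complex:
    return z * z + 1

def delta_for(eps: float) -> float:
    """Positive root of delta * (delta + 2*|1 + i|) = eps."""
    c = abs(Z0)  # |1 + i| = sqrt(2)
    return -c + math.sqrt(c * c + eps)

def worst_error(eps: float, samples: int = 360) -> float:
    """Largest |f(z) - W0| over sample points with |z - Z0| = 0.999 * delta."""
    radius = 0.999 * delta_for(eps)
    return max(
        abs(f(Z0 + radius * cmath.exp(2j * math.pi * k / samples)) - W0)
        for k in range(samples)
    )

for eps in (1.0, 1e-2, 1e-4):
    print(eps, delta_for(eps), worst_error(eps))
```

In every row the last number should be smaller than the first, as the inequality derived in the solution requires.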
Example 10.7 Prove that

$$\lim_{z \to 2i} \left(\frac{z^2 + 4}{z - 2i}\right) = 4i.$$

Solution Unlike the previous situation, the function $f(z) = (z^2 + 4)/(z - 2i)$
is not defined when z = 2i. To establish the desired result we notice that

$$|f(z) - 4i| = \left|\frac{z^2 + 4 - 4i(z - 2i)}{z - 2i}\right| = \left|\frac{z^2 - 4iz - 4}{z - 2i}\right| = \left|\frac{(z - 2i)^2}{z - 2i}\right| = |z - 2i|.$$

Thus we can ensure that $|f(z) - 4i| < \varepsilon$ by taking $|z - 2i| < \delta$, where
here $\delta = \varepsilon$. The conditions of Definition 10.4 are satisfied, and thus we have
established that

$$\lim_{z \to 2i} \left(\frac{z^2 + 4}{z - 2i}\right) = 4i,$$

despite the fact that the function $f(z) = (z^2 + 4)/(z - 2i)$ is not defined at
z = 2i.
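The identity |f(z) − 4i| = |z − 2i| on which Example 10.7 rests can be observed numerically. In the Python sketch below the offsets h are arbitrary illustrative choices, and the final lines show that evaluating f at z = 2i itself fails, as it must.

```python
def f(z: complex) -> complex:
    """f(z) = (z^2 + 4)/(z - 2i), undefined at z = 2i."""
    return (z * z + 4) / (z - 2j)

for h in (0.1, 0.001j, (1 + 1j) * 1e-4):
    z = 2j + h
    # The two printed moduli should agree to rounding error.
    print(abs(f(z) - 4j), abs(h))

try:
    f(2j)
except ZeroDivisionError:
    print("f is not defined at z = 2i")
```

The approach to 4i is thus uniform in the direction of h, even though the point z = 2i is excluded from the domain of f.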

The results of Theorem 10.2 generalize to give limit theorems for functions
of a complex variable.


THEOREM 10.3 (operations on limits of complex functions) If f(z) and
g(z) are two complex functions for which

$$\lim_{z \to z_0} f(z) = v_0 \quad \text{and} \quad \lim_{z \to z_0} g(z) = w_0,$$

then

(a) $\lim_{z \to z_0} [f(z) + g(z)] = v_0 + w_0$;

(b) $\lim_{z \to z_0} f(z)g(z) = v_0w_0$;

(c) provided $w_0 \ne 0$, $\lim_{z \to z_0} [f(z)]/[g(z)] = v_0/w_0$.


The proofs of these results follow directly from Definition 10.4 and are
left to the reader.


Example 10.8 Apply the results of Theorem 10.3 to the functions
$f(z) = z^2 + 2z + 1$ and $g(z) = 1 - iz$ to determine the limits of f(z) + g(z),
f(z)g(z), and f(z)/g(z) as $z \to i$.


Solution The functions f(z) and g(z) are defined for all z and so it is easily
seen that

$$\lim_{z \to i} f(z) = \lim_{z \to i} (z^2 + 2z + 1) = 2i$$

and

$$\lim_{z \to i} g(z) = \lim_{z \to i} (1 - iz) = 2.$$

These results, which have been obtained by direct substitution, may be verified
by using Definition 10.4, as in Example 10.6. Results (a), (b), and (c) of
Theorem 10.3 may thus be applied to yield:

(a) $\lim_{z \to i} [f(z) + g(z)] = 2i + 2$;

(b) $\lim_{z \to i} f(z)g(z) = 4i$;

(c) as $\lim_{z \to i} g(z) = 2 \ne 0$, $\lim_{z \to i} \dfrac{f(z)}{g(z)} = i$.
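A quick numerical confirmation of Example 10.8 is obtained by evaluating f and g at a point very close to z = i. In the Python sketch below the offset 10⁻⁸(1 + i) is an arbitrary illustrative choice.

```python
def f(z: complex) -> complex:
    return z * z + 2 * z + 1

def g(z: complex) -> complex:
    return 1 - 1j * z

z = 1j + 1e-8 * (1 + 1j)   # a point close to z = i

print(f(z) + g(z))   # should be close to 2 + 2i
print(f(z) * g(z))   # should be close to 4i
print(f(z) / g(z))   # should be close to i
```

The three printed values approximate the limits (a), (b), and (c) to within the size of the offset.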
It is now a simple step to extend the idea of continuity for, as with real
valued functions of a real variable (cf. Definition 3.9), we shall say that the
function f(z) is continuous at $z_0$ if $\lim_{z \to z_0} f(z) = w_0$ exists and $f(z_0) = w_0$. We
thus arrive at the following statement.


DEFINITION 10.5 (continuity of a function of a complex variable) The
complex function f(z) will be said to be continuous at $z_0$ if:

(a) $\lim_{z \to z_0} f(z) = w_0$ exists,

and

(b) $f(z_0) = w_0$.


A complex function will be said to be continuous in a region of the complex 
plane if it is continuous at all points of that region. 


Example 10.9 Prove that the function f(z) = a + bz is continuous
everywhere.


Solution If $z_0$ is any complex number we have

$$|f(z) - f(z_0)| = |a + bz - a - bz_0| = |b|\,|z - z_0|,$$

so that for any $\varepsilon > 0$,

$$|f(z) - f(z_0)| < \varepsilon \quad \text{if } |z - z_0| < \delta$$

provided that we take $\delta = \varepsilon/|b|$ (when b = 0 the difference vanishes
identically and any $\delta > 0$ will serve). We have proved that

$$\lim_{z \to z_0} f(z) = a + bz_0,$$

which is condition (a) of Definition 10.5. Condition (b) is obviously true as
$f(z_0) = a + bz_0$ for all $z_0$. As $z_0$ was arbitrary, it follows that we have
proved the required property of continuity for f(z). Notice that by first setting
b = 0 and then setting a = 0, b = 1, the continuity of the functions f(z) = a
(constant) and f(z) = z follows as special cases.


Example 10-10 Prove that the function f,(z) = z”, where n is a positive 
integer, is continuous for all z. 


Solution The proof is by induction. In the previous example we proved as a
special case the continuity of $f_1(z) = z$. If we assume that $f_m(z)$ is continuous,
then since $f_{m+1}(z) = z^{m+1} = z \cdot z^m = f_1(z)\,f_m(z)$, it follows directly from
Theorem 10.3 (b) that $f_{m+1}(z)$ is continuous. Thus if P(m) is the property that
$f_m(z)$ is continuous, we have proved directly that P(1) is true and also that if
P(m) is true, then so also is P(m + 1). Hence it follows by induction that
P(m) is true for all m, which establishes our result.


Further use of Definition 10-5 coupled with Theorem 10:3 makes it a 
straightforward matter to establish many other important and useful results 
concerning continuity. Typical of results that follow from such reasoning 
are that a complex polynomial 


$$P(z) = a_0 + a_1z + a_2z^2 + \cdots + a_nz^n$$


is continuous everywhere, whilst a complex rational function

$$R(z) = \frac{a_0 + a_1z + a_2z^2 + \cdots + a_mz^m}{b_0 + b_1z + b_2z^2 + \cdots + b_nz^n}$$


is continuous everywhere except at the n zeros of the denominator.

It is interesting to give an alternative proof of the continuous nature of a
polynomial P(z). As z = x + iy, it follows that we may express P(z) in the
form

$$P(z) = Q_1(x, y) + iQ_2(x, y),$$

where $Q_1(x, y)$ and $Q_2(x, y)$ are real polynomial functions each with general
terms of the form $x^sy^t$ in which s, t are either zero or positive integers. Now
from the behaviour of real functions of two real variables we know that
$Q_1(x, y)$ and $Q_2(x, y)$ must be continuous functions of x and y everywhere
in the plane.

However, if $z_1$ and $z_2$ are any two points with $z_1 = x_1 + iy_1$ and
$z_2 = x_2 + iy_2$, then

$$\begin{aligned} |P(z_2) - P(z_1)| &= |Q_1(x_2, y_2) - Q_1(x_1, y_1) + i[Q_2(x_2, y_2) - Q_2(x_1, y_1)]| \\ &\le |Q_1(x_2, y_2) - Q_1(x_1, y_1)| + |Q_2(x_2, y_2) - Q_2(x_1, y_1)|. \end{aligned}$$

Now as $Q_1(x, y)$ and $Q_2(x, y)$ are continuous, it is true that

$$\lim_{\substack{x_2 \to x_1 \\ y_2 \to y_1}} Q_1(x_2, y_2) = Q_1(x_1, y_1) \quad \text{and} \quad \lim_{\substack{x_2 \to x_1 \\ y_2 \to y_1}} Q_2(x_2, y_2) = Q_2(x_1, y_1),$$

and so $|P(z_2) - P(z_1)|$ may be made arbitrarily small by taking $z_2$ sufficiently
close to $z_1$. This proves our assertion of the continuity of P(z) for all z, since
$z_1$, $z_2$ were arbitrary points in the complex plane.

Obvious extensions of the other continuity theorems proved for real 

variables are also possible and the most useful ones are summarized below 
without further proof. 


THEOREM 10.4 (continuity theorem for complex functions) If f(z) and
g(z) are two complex functions each continuous at $z = z_0$, then


(a) f(z) + g(z) is continuous at $z_0$;

(b) f(z) g(z) is continuous at $z_0$;

(c) f(z)/g(z) is continuous at $z_0$ provided $g(z_0) \ne 0$;

(d) if f(w) is continuous at $w = w_0$ and w = g(z) is continuous at $z = z_0$,
with $w_0 = g(z_0)$, then the composite function (function of a function)
f[g(z)] is continuous at $z = z_0$.


It is, for example, condition (d) of this theorem that validates the assertion
that $(z^2 + 3z + 2)^3$ is continuous everywhere. (Why?)


10.4 Derivatives: Cauchy-Riemann equations


Thus far the related concepts of a limit, a function, and of continuity 
have been successfully extended to include a function of a complex variable. 
It is now reasonable to attempt to generalize the notion of a derivative, and 
at this point we encounter a major dissimilarity between a function of a 
complex variable and a real valued function of two real variables. Indeed, 
whereas we have already seen that most real valued functions of two real 
variables are partially differentiable with respect to those variables, it will 
shortly be shown that the operation of differentiation can only be defined 
for a very special class of complex functions. Before discovering the exact 
nature of the restriction on a complex function if it is to be differentiable, 
we must extend our definition of a derivative in a manner compatible with 
the real variable case. 


DEFINITION 10.6 (derivative of a complex function) Let w = f(z) be
defined in some neighbourhood of the point $z = z_0$ and let |h| be sufficiently
small for $z = z_0 + h$ to lie within this neighbourhood. Then, if the difference
quotient

$$\frac{f(z_0 + h) - f(z_0)}{h}$$

tends to the limit γ as |h| → 0, we shall call γ the derivative of f(z) at $z_0$ and
will write either

$$f'(z_0) = \gamma \quad \text{or} \quad \left.\frac{dw}{dz}\right|_{z = z_0} = \gamma.$$

If this difference quotient has a limit for all points $z_0$ of some region in which
w = f(z) is defined, then f(z) will be said to be differentiable in that region.
The derivative, as a function of a general point z, will be denoted either by
f′(z) or dw/dz.


Alternatively expressed, this definition asserts that the complex number
γ is the derivative of w = f(z) at $z = z_0$ if, for every $\varepsilon > 0$, there exists a δ
such that

$$\left|\frac{f(z_0 + h) - f(z_0)}{h} - \gamma\right| < \varepsilon \quad \text{for } 0 < |h| < \delta.$$


Notice that although h is small when |h| is small, the condition |h| → 0
that is imposed in our definition of a derivative requires the limit defining
the derivative to exist for all possible methods of approach of h towards zero.
This means that if the derivative is to exist, then it must be independent of
the manner in which h → 0. This is a vitally important feature of the definition
and one to which we shall return.


Example 10.11 Prove that if n > 0 is an integer and $w = z^n$, then

$$\frac{dw}{dz} = nz^{n-1}$$

for all z.


Solution Consider some point $z_0$ and form the difference quotient

$$\frac{(z_0 + h)^n - z_0^n}{h},$$

where h is any complex number. Then by the binomial theorem

$$\frac{(z_0 + h)^n - z_0^n}{h} = \frac{z_0^n + nhz_0^{n-1} + \frac{n(n-1)}{2!}h^2z_0^{n-2} + \cdots + h^n - z_0^n}{h}$$

and thus

$$\frac{(z_0 + h)^n - z_0^n}{h} = nz_0^{n-1} + \frac{n(n-1)}{2!}hz_0^{n-2} + \cdots + h^{n-1}.$$

Now as |h| → 0 implies h → 0, taking the limit of this expression as |h| → 0
we arrive at the derivative of the function $w = z^n$ at the point $z_0$:

$$\lim_{|h| \to 0} \frac{(z_0 + h)^n - z_0^n}{h} = nz_0^{n-1}.$$

Since the point $z_0$ was arbitrary this result is true for all $z_0$, and so the
function is differentiable for all z and

$$\frac{dw}{dz} = nz^{n-1}.$$

A more subtle argument shows that this result is, in fact, true for any value
of n and not just for n a positive integer.
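The requirement that the difference quotient should approach the same value however h tends to zero can be examined numerically. The Python sketch below evaluates the quotient for $w = z^5$ with small increments h taken along eight different directions; the point z₀, the step size and the helper names are illustrative assumptions. For contrast it also forms the quotient for the conjugate function $\bar{z}$ at the origin, a function which the text goes on to show is not differentiable.

```python
import cmath
import math

def difference_quotient(f, z0: complex, h: complex) -> complex:
    """[f(z0 + h) - f(z0)] / h for a non-zero complex increment h."""
    return (f(z0 + h) - f(z0)) / h

def directional_quotients(f, z0: complex, size: float = 1e-6, directions: int = 8):
    """Difference quotients with |h| = size along equally spaced directions."""
    return [
        difference_quotient(f, z0, size * cmath.exp(2j * math.pi * k / directions))
        for k in range(directions)
    ]

n = 5
z0 = 0.7 - 0.4j
exact = n * z0 ** (n - 1)                      # n z0^(n-1)
quotients = directional_quotients(lambda z: z ** n, z0)
print(max(abs(q - exact) for q in quotients))  # small for every direction

# Contrast: the conjugate function gives direction-dependent quotients at 0.
conj_along_real = difference_quotient(lambda z: z.conjugate(), 0j, 1e-6)
conj_along_imag = difference_quotient(lambda z: z.conjugate(), 0j, 1e-6j)
print(conj_along_real, conj_along_imag)
```

For $z^5$ all eight quotients agree with $nz_0^{n-1}$ to within terms of order |h|, whereas the two quotients for the conjugate function differ in sign, so that no single limiting value can exist.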
Example 10.12 Prove that if w = sin z, then

$$\frac{dw}{dz} = \cos z$$

for all z.


Solution Let $z_0$ be any value of z and form the difference quotient

$$\frac{\sin(z_0 + h) - \sin z_0}{h},$$

where h is any complex number. Then using a familiar trigonometric identity
we have

$$\frac{\sin(z_0 + h) - \sin z_0}{h} = \frac{[\sin z_0 \cos h + \cos z_0 \sin h] - \sin z_0}{h} = \cos z_0\left(\frac{\sin h}{h}\right) - \sin z_0\left(\frac{1 - \cos h}{h}\right).$$

Now w = f(z) will be differentiable if the limit of the right-hand side of this
expression can be shown to exist. This is most easily done by utilizing the
formal power series expansions for the sine and cosine functions, which show
that

$$\frac{\sin h}{h} = 1 - \frac{h^2}{3!} + \frac{h^4}{5!} - \cdots$$

and

$$\frac{1 - \cos h}{h} = \frac{h}{2!} - \frac{h^3}{4!} + \cdots.$$

It is clear from these that because |h| → 0 implies h → 0, then

$$\lim_{|h| \to 0} \left(\frac{\sin h}{h}\right) = 1, \quad \lim_{|h| \to 0} \left(\frac{1 - \cos h}{h}\right) = 0.$$

Returning to our problem, taking the limit of the difference quotient as
|h| → 0 and using the above limits gives for the derivative of w = sin z at
the point $z_0$ the result

$$\lim_{|h| \to 0} \left(\frac{\sin(z_0 + h) - \sin z_0}{h}\right) = \cos z_0.$$

Once again, as $z_0$ was arbitrary, we have shown that w = sin z is
differentiable for all z, so we may write

$$\frac{d}{dz}(\sin z) = \cos z.$$

Alternative derivations of the two limits involved in this example are indicated
in Problems 10.22 and 10.23.
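The two limits used in Example 10.12 hold for complex h, not merely for real h, and this is readily observed numerically. The Python sketch below evaluates both ratios for small real, imaginary, and general complex increments; the particular values of h are arbitrary illustrative choices.

```python
import cmath

def sin_ratio(h: complex) -> complex:
    """(sin h)/h, which tends to 1 as h -> 0."""
    return cmath.sin(h) / h

def cos_ratio(h: complex) -> complex:
    """(1 - cos h)/h, which tends to 0 as h -> 0."""
    return (1 - cmath.cos(h)) / h

for h in (1e-3, 1e-3j, 1e-3 * (1 + 1j)):
    print(h, abs(sin_ratio(h) - 1), abs(cos_ratio(h)))
```

The first discrepancy is of order |h|²/6 and the second of order |h|/2, in agreement with the leading terms of the two series.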


The following theorem is an obvious extension of Theorems 5.4 to 5.8
relating to the real variable case.


THEOREM 10.5 (rules of differentiation) If f, g are differentiable functions
in some region, then throughout that region:


(a) $\dfrac{d}{dz}[f(z) + g(z)] = \dfrac{df}{dz} + \dfrac{dg}{dz}$ (Derivative of Sum);

(b) $\dfrac{d}{dz}[f(z)g(z)] = f(z)\dfrac{dg}{dz} + g(z)\dfrac{df}{dz}$ (Derivative of Product);

(c) $\dfrac{d}{dz}\left[\dfrac{f(z)}{g(z)}\right] = \dfrac{g(z)(df/dz) - f(z)(dg/dz)}{[g(z)]^2}$, provided $g(z) \ne 0$
(Derivative of Quotient);

(d) $\dfrac{d}{dz}\{f[g(z)]\} = f'[g(z)]\,g'(z)$ or, by writing u = g(z), this takes the
form $\dfrac{d}{dz}\{f[g(z)]\} = \dfrac{df}{du}\cdot\dfrac{du}{dz}$ (Chain Rule);

(e) f(z) and g(z) are continuous functions of z (Differentiability implies
Continuity).


Proof All these results may be established directly from the definition of a 
derivative by arguments that are essentially similar to the real variable case. 
We give the proofs of (a) and (e) as illustrations. 

Result (a) follows because

$$\frac{d}{dz}[f(z) + g(z)] = \lim_{|h| \to 0} \left[\frac{f(z + h) + g(z + h) - f(z) - g(z)}{h}\right]$$

$$= \lim_{|h| \to 0} \left[\frac{f(z + h) - f(z)}{h}\right] + \lim_{|h| \to 0} \left[\frac{g(z + h) - g(z)}{h}\right] = \frac{df}{dz} + \frac{dg}{dz}.$$

Result (e) follows because differentiability of a function $f(z)$ requires the difference quotient $[f(z + h) - f(z)]/h$ to have a limit as $|h| \to 0$, which in turn requires that $|f(z + h) - f(z)| \to 0$ as $|h| \to 0$. This is just the formal statement that $f(z)$ is continuous and so our assertion is proved.


Example 10.13 Use the derivatives established so far together with Theorem 10.5 to differentiate the functions:

(a) $w = z^2 + 3\sin z$;

(b) $w = z^3 \sin z$;

(c) $w = 1/(1 + z)$;

(d) $w = \sin(z^2 + z + 3)$.


Solution
(a) Using Theorem 10.5 (a) with $f(z) = z^2$, $g(z) = 3\sin z$ we obtain

$$\frac{dw}{dz} = 2z + 3\cos z$$

for all $z$.

(b) Using Theorem 10.5 (b) with $f(z) = z^3$, $g(z) = \sin z$ we obtain

$$\frac{dw}{dz} = z^3 \cos z + 3z^2 \sin z$$

for all $z$.

(c) Using Theorem 10.5 (c) with $f(z) = 1$, $g(z) = (1 + z)$ we obtain

$$\frac{dw}{dz} = \frac{-1}{(1 + z)^2}$$

for $z \neq -1$.

(d) Writing $w = \sin u$, where $u = z^2 + z + 3$, enables us to apply the chain rule (Theorem 10.5 (d)):

$$\frac{dw}{dz} = \frac{d}{du}(\sin u)\cdot\frac{du}{dz} = (\cos u)(2z + 1),$$

whence

$$\frac{dw}{dz} = (2z + 1)\cos(z^2 + z + 3).$$


Let us now explore more carefully the implications of the requirements of 
differentiability. This is perhaps best prefaced by an illustration of a simple 
function of a complex variable that is not differentiable. 

We shall attempt to compute the derivative at $z_0 = 0$ of the function $f(z) = \bar{z}$, where $z = x + iy$. We have $f(z) = x - iy$, from which it follows that $f(0) = 0$, so that in computing the required derivative we are led to consider the behaviour of the difference quotient

$$\frac{f(0 + h) - f(0)}{h} = \frac{\bar{h} - 0}{h} = \frac{\bar{h}}{h}$$

as $|h| \to 0$. Writing $h = \alpha + i\beta$ this becomes

$$\frac{\bar{h}}{h} = \frac{\alpha - i\beta}{\alpha + i\beta}.$$

Obviously this expression can have no limit as $|h| \to 0$ because the result is dependent on the manner of approach of $h$ to zero. To see this we need take only two special cases:

(a) if $\alpha = 0$, and $\beta \to 0$, then

$$\lim_{\substack{\beta \to 0 \\ \alpha = 0}} \left(\frac{\bar{h}}{h}\right) = -1,$$

whereas (b) if $\beta = 0$, $\alpha \to 0$, then

$$\lim_{\substack{\alpha \to 0 \\ \beta = 0}} \left(\frac{\bar{h}}{h}\right) = 1.$$

The limit thus depends on the manner in which $h \to 0$ so that no derivative exists in the sense of Definition 10.6.
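The dependence on the direction of approach can be seen numerically. The sketch below is illustrative only: the helper name `conj_quotient` and the parametrisation $h = |h|e^{i\phi}$ are assumptions made for this example. For such $h$ the quotient $\bar{h}/h$ equals $e^{-2i\phi}$, so it changes with $\phi$ however small $|h|$ is.

```python
import cmath

def conj_quotient(phi, size=1e-8):
    """Difference quotient of f(z) = conj(z) at z0 = 0 for h = size * e^{i phi}."""
    h = size * cmath.exp(1j * phi)
    return h.conjugate() / h

along_real = conj_quotient(0.0)            # h real: case (b) of the text
along_imag = conj_quotient(cmath.pi / 2)   # h purely imaginary: case (a)
along_diag = conj_quotient(cmath.pi / 4)   # an intermediate direction

print(along_real, along_imag, along_diag)
```

The three values are close to $1$, $-1$ and $-i$ respectively, so no single limit exists.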

Obviously some conditions must be devised such that it is possible to decide, without appeal to Definition 10.6, whether or not a given function $f(z)$ has a unique derivative, that is, whether or not the limit of the difference quotient in Definition 10.6 is independent of the manner in which $h \to 0$.

Consider a function $f(z)$, assumed to be differentiable in some region, and express it in the form

$$f(z) = u + iv, \qquad (10.4)$$

where $u$, $v$ are functions of $x$ and $y$ by virtue of the relationship $z = x + iy$. (Cf. the illustrative examples (a) to (e) following Definition 10.3.) Let us now compute the derivative of $f(z)$ and, in doing so, appeal to Fig. 10.3.


Fig. 10.3 Derivative of a complex function.


As $f(z)$ is assumed to be differentiable, we shall choose an arbitrary $h$ as shown in the Figure and allow it to tend to zero along the line QP inclined at an angle $\alpha$ to the x-axis. Then if $h = \lambda + i\mu$, it follows that $z + h = x + \lambda + i(y + \mu)$, and so if we also make use of the alternative representation of $h$ in the form $h = |h|e^{i\alpha}$, where $|h| = (\lambda^2 + \mu^2)^{1/2}$, we have

$$f'(z) = \lim_{|h| \to 0} \frac{f(z + h) - f(z)}{h} = \lim_{|h| \to 0} \frac{f[x + \lambda + i(y + \mu)] - f(x + iy)}{|h|e^{i\alpha}}$$

or

$$f'(z) = e^{-i\alpha} \lim_{|h| \to 0} \left[\frac{u(x + \lambda, y + \mu) + iv(x + \lambda, y + \mu) - u(x, y) - iv(x, y)}{(\lambda^2 + \mu^2)^{1/2}}\right]. \qquad (10.5)$$

As $f(z)$ is assumed to be differentiable, result (10.5) must be independent of the angle $\alpha$. To see the implications of this let us first consider the real part of the bracketed expression inside the limit (10.5) which is


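$$\frac{u(x + \lambda, y + \mu) - u(x, y)}{(\lambda^2 + \mu^2)^{1/2}}. \qquad (10.6)$$

By adding to and subtracting from the numerator of this expression the term $u(x, y + \mu)$, it is soon verified, with a little manipulation, that it is equivalent to

$$\left(\frac{u(x + \lambda, y + \mu) - u(x, y + \mu)}{\lambda}\right)\frac{\lambda}{(\lambda^2 + \mu^2)^{1/2}} + \left(\frac{u(x, y + \mu) - u(x, y)}{\mu}\right)\frac{\mu}{(\lambda^2 + \mu^2)^{1/2}}. \qquad (10.7)$$

Geometry tells us that

$$\frac{\lambda}{(\lambda^2 + \mu^2)^{1/2}} = \cos\alpha, \qquad \frac{\mu}{(\lambda^2 + \mu^2)^{1/2}} = \sin\alpha$$

so, when taken in conjunction with the fact that $|h| \to 0$ implies $\lambda \to 0$, $\mu \to 0$, the limit of expression (10.7) as $|h| \to 0$ becomes

$$\frac{\partial u}{\partial x}\cos\alpha + \frac{\partial u}{\partial y}\sin\alpha. \qquad (10.8)$$

An identical argument applied to the imaginary part of the bracketed expression inside the limit (10.5) yields the result

$$\frac{\partial v}{\partial x}\cos\alpha + \frac{\partial v}{\partial y}\sin\alpha.$$

Hence, the limit (10.5) is equivalent to

$$f'(z) = e^{-i\alpha}\left[\left(\frac{\partial u}{\partial x}\cos\alpha + \frac{\partial u}{\partial y}\sin\alpha\right) + i\left(\frac{\partial v}{\partial x}\cos\alpha + \frac{\partial v}{\partial y}\sin\alpha\right)\right]. \qquad (10.9)$$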


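For $f'(z)$ to be independent of the manner in which $h \to 0$, it follows that Eqn (10.9) must be independent of the value of $\alpha$. In particular, the real and imaginary parts of this expression must be independent of $\alpha$. Expressing $f'(z)$ in real-imaginary form, using $e^{-i\alpha} = \cos\alpha - i\sin\alpha$, we obtain

$$f'(z) = \left[\frac{\partial u}{\partial x}\cos^2\alpha + \frac{\partial v}{\partial y}\sin^2\alpha + \left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)\sin\alpha\cos\alpha\right] + i\left[\frac{\partial v}{\partial x}\cos^2\alpha - \frac{\partial u}{\partial y}\sin^2\alpha + \left(\frac{\partial v}{\partial y} - \frac{\partial u}{\partial x}\right)\sin\alpha\cos\alpha\right]. \qquad (10.10)$$

Inspection shows that it can only be independent of $\alpha$ if both the following conditions are satisfied:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \qquad (10.11)$$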


These are known as the Cauchy-Riemann equations and are fundamental to the development of the theory of functions of a complex variable. An immediate consequence of the Cauchy-Riemann equations is that Eqn (10.10) may be written either as

$$f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \quad (\text{case } \alpha = 0) \qquad (10.12)$$

or as

$$f'(z) = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y} \quad (\text{case } \alpha = \tfrac{1}{2}\pi). \qquad (10.13)$$

It has thus been established that if a function $f(z)$ is to have a uniquely determined derivative at a point in the sense of Definition 10.6, then it must satisfy the Cauchy-Riemann equations (10.11).

We now check whether the converse also holds, that is, whether the satisfaction of the Cauchy-Riemann equations by a function automatically implies that the function has a unique derivative. Let $w = u + iv$ be a function such that $u$, $v$ satisfy Eqns (10.11). Consider first the function $u$ at some point $z = x + iy$. We know from Chapter 5 that at a neighbouring point $z + h$ with $h = \lambda + i\mu$, for $\Delta u = u(x + \lambda, y + \mu) - u(x, y)$ we may substitute the expression

$$\Delta u = \frac{\partial u}{\partial x}\lambda + \frac{\partial u}{\partial y}\mu + \varepsilon_1\lambda + \eta_1\mu,$$

where $\varepsilon_1, \eta_1 \to 0$ as $\lambda, \mu \to 0$ provided that $u_x$ and $u_y$ are continuous. A similar result is of course true for $\Delta v$, the change in $v$ consequent upon moving from $z$ to $z + h$, though for $\varepsilon_1$, $\eta_1$ we must substitute $\varepsilon_2$, $\eta_2$ and require that $v_x$, $v_y$ are continuous.


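Thus if $\Delta f = f(z + h) - f(z)$, we have

$$\Delta f = \frac{\partial u}{\partial x}\lambda + \frac{\partial u}{\partial y}\mu + i\left(\frac{\partial v}{\partial x}\lambda + \frac{\partial v}{\partial y}\mu\right) + (\varepsilon_1\lambda + \eta_1\mu) + i(\varepsilon_2\lambda + \eta_2\mu).$$

Using the Cauchy-Riemann equations this can be re-expressed as

$$\Delta f = \left(\frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}\right)h + (\varepsilon_1 + i\varepsilon_2)\lambda + (\eta_1 + i\eta_2)\mu,$$

whence

$$\frac{\Delta f}{h} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} + (\varepsilon_1 + i\varepsilon_2)\left(\frac{\lambda}{h}\right) + (\eta_1 + i\eta_2)\left(\frac{\mu}{h}\right). \qquad (10.14)$$

However, $|\lambda| \le |h|$, $|\mu| \le |h|$ so that

$$\left|\frac{\lambda}{h}\right| \le 1, \qquad \left|\frac{\mu}{h}\right| \le 1;$$

and as $\varepsilon_1$, $\varepsilon_2$, $\eta_1$, and $\eta_2$ all tend to zero as $\lambda, \mu \to 0$, by taking the limit of Eqn (10.14) as $|h| \to 0$ we arrive at

$$f'(z) = \lim_{|h| \to 0} \frac{\Delta f}{h} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.$$

The fact that $f(z)$ is assumed to satisfy the Cauchy-Riemann equations and to have continuous partial derivatives $u_x$, $u_y$, $v_x$, and $v_y$ has thus enabled us to prove that $f(z)$ has a unique derivative. We have established the following fundamental theorem.


THEOREM 10.6 (Cauchy-Riemann theorem) If $u(x, y)$ and $v(x, y)$ have continuous first order partial derivatives in some region, then necessary and sufficient conditions that $f(z) = u + iv$ should have a derivative at each point $z = x + iy$ of that region are that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$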

Results (10.12) and (10.13) may be used to deduce the form of $f'(z)$ by using the simple observation that when $z$ is purely real, so that $z = x$, the forms assumed by $f'(z)$ and $f'(x)$ are identical. Similarly, when $z$ is purely imaginary, so that $z = iy$, the forms of $f'(z)$ and $f'(iy)$ are identical. This gives the following straightforward rule for determining the derivative $f'(z)$ of the function $f(z)$ which is sometimes helpful.


Rule 1 (Determination of the derivative of a complex function)


If $f(z) = u + iv$ satisfies the Cauchy-Riemann equations, then the derivative $f'(z)$ expressed in terms of $z$ may either be deduced

(a) from the result

$$f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}$$

by formally setting $y = 0$, and then replacing $x$ by $z$; or

(b) from the result

$$f'(z) = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y}$$

by formally setting $x = 0$, and then replacing $iy$ by $z$.
Example 10.14 Determine which of the following functions satisfy the Cauchy-Riemann equations and thus possess uniquely defined derivatives. Give the form of this derivative when it is defined.

(a) $w = z^2$;

(b) $w = \cos z$;

(c) $w = |z|$.

Solution
(a) If $w = z^2$, then $w = (x + iy)^2 = x^2 - y^2 + i2xy$ and so $u = x^2 - y^2$, $v = 2xy$. So $u_x = 2x$, $u_y = -2y$, $v_x = 2y$, and $v_y = 2x$. It is readily seen that these expressions satisfy the Cauchy-Riemann equations and so we may conclude that $w = z^2$ possesses a unique derivative. It follows from Eqn (10.12) that

$$f'(z) = 2x + i2y = 2z.$$

This result was so simple that appeal to Rule 1 was not necessary.

(b) If $w = \cos z$, then $w = \cos(x + iy) = \cos x \cos iy - \sin x \sin iy$, whence $w = \cos x \cosh y - i\sin x \sinh y$, and so $u = \cos x \cosh y$, $v = -\sin x \sinh y$. Hence, $u_x = -\sin x \cosh y$, $u_y = \cos x \sinh y$, $v_x = -\cos x \sinh y$ and $v_y = -\sin x \cosh y$. Here also it is immediately apparent that the expressions satisfy the Cauchy-Riemann equations, showing that $w = \cos z$ possesses a unique derivative.

Let us choose to work with Rule 1 (a) to determine $f'(z)$ in terms of $z$. We must therefore start with the equation

$$f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.$$

In this case we find

$$f'(z) = -\sin x \cosh y - i\cos x \sinh y.$$

Then, setting $y = 0$ and replacing $x$ by $z$ gives

$$f'(z) = -\sin z.$$

It is instructive to compare this rapid method with the direct approach we now indicate.

$$f'(z) = -\sin x \cosh y - i\cos x \sinh y = -\sin x \cos iy - \cos x \sin iy = -\sin(x + iy) = -\sin z.$$

(c) If $w = |z|$, then $w = (x^2 + y^2)^{1/2}$, showing that $u = (x^2 + y^2)^{1/2}$, $v = 0$. Then, as $u_x = x/(x^2 + y^2)^{1/2}$, $u_y = y/(x^2 + y^2)^{1/2}$, $v_x = v_y = 0$, it is clear that $w = |z|$ cannot satisfy the Cauchy-Riemann equations anywhere in the complex plane. We conclude that $w = |z|$ has no derivative at any point in the complex plane.


Example 10.15 Determine the constants $a$ and $b$ in order that

$$w = x^2 + ay^2 - 2xy + i(bx^2 - y^2 + 2xy)$$

should satisfy the Cauchy-Riemann equations. Deduce the derivative of $w$.

Solution Here we have $u = x^2 + ay^2 - 2xy$, $v = bx^2 - y^2 + 2xy$ so that $u_x = 2x - 2y$, $u_y = 2ay - 2x$, $v_x = 2bx + 2y$, and $v_y = -2y + 2x$. It is certainly true that $u_x = v_y$, so that the first of the Cauchy-Riemann equations is automatically satisfied. For the second equation to be satisfied we must require that $u_y = -v_x$, or $2ay - 2x = -(2bx + 2y)$. This is only possible if $a = -1$, $b = 1$.

Now as $f'(z) = u_x + iv_x$, we have

$$f'(z) = 2x - 2y + i(2x + 2y).$$

Again, working with Rule 1 (a) gives

$$f'(z) = 2z + i2z.$$

Had we chosen to work with Rule 1 (b) to express $f'(z)$ in terms of $z$ we should have started from the equation

$$f'(z) = v_y - iu_y$$

which in this case becomes

$$f'(z) = -2y + 2x + i(2y + 2x).$$

Then, setting $x = 0$ and this time replacing $iy$ by $z$, we again arrive at

$$f'(z) = 2z + i2z.$$
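A quick numerical sanity check of this example is sketched below. The closed form $w = (1 + i)z^2$ is not stated in the text; it is an inference from $f'(z) = 2z + i2z$ (taking the constant of integration as zero), and the sample point is arbitrary.

```python
def w_from_parts(x, y, a=-1.0, b=1.0):
    """Assemble w = u + iv from the real expressions of Example 10.15."""
    u = x**2 + a * y**2 - 2 * x * y
    v = b * x**2 - y**2 + 2 * x * y
    return complex(u, v)

x, y = 0.6, -1.1   # arbitrary sample point (assumption)
z = complex(x, y)

# With a = -1, b = 1 the function should coincide with (1 + i) z^2.
gap_analytic = abs(w_from_parts(x, y) - (1 + 1j) * z * z)

# With another choice, for example a = b = 1, it does not.
gap_other = abs(w_from_parts(x, y, a=1.0, b=1.0) - (1 + 1j) * z * z)

print(gap_analytic, gap_other)
```

The first gap vanishes to rounding error, while the second does not, reflecting the fact that only $a = -1$, $b = 1$ makes $w$ a function of $z$ alone.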


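As the complex number $z$ can also be expressed in modulus argument form by writing $z = re^{i\theta}$, it is necessary to know the form taken by the Cauchy-Riemann equations in terms of the variables $(r, \theta)$. This is most readily achieved by appeal to Theorem 5.22.

It follows directly from Theorem 5.22 that:

$$\frac{\partial u}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial u}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial u}{\partial\theta}, \qquad \frac{\partial u}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial u}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial u}{\partial\theta},$$

$$\frac{\partial v}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial v}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial v}{\partial\theta}, \qquad \frac{\partial v}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial v}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial v}{\partial\theta}. \qquad (10.15)$$

In these equations $(r, \theta)$ are the polar coordinates of the point $(x, y)$ and so

$$x = r\cos\theta, \qquad y = r\sin\theta.$$

(See Eqns (4.13).)
These relationships may now be used to determine $\partial r/\partial x$, $\partial r/\partial y$, $\partial\theta/\partial x$, $\partial\theta/\partial y$ as follows: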


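Since $r^2 = x^2 + y^2$ and $\tan\theta = y/x$, partial differentiation with respect to $x$ (holding $y$ fixed) and with respect to $y$ (holding $x$ fixed) gives

$$2r\frac{\partial r}{\partial x} = 2x \quad\text{and so}\quad \frac{\partial r}{\partial x} = \frac{x}{r} = \cos\theta;$$

$$\sec^2\theta\,\frac{\partial\theta}{\partial x} = -\frac{y}{x^2} \quad\text{whence}\quad \frac{\partial\theta}{\partial x} = -\frac{y}{r^2} = -\frac{\sin\theta}{r};$$

$$2r\frac{\partial r}{\partial y} = 2y \quad\text{and so}\quad \frac{\partial r}{\partial y} = \frac{y}{r} = \sin\theta;$$

$$\sec^2\theta\,\frac{\partial\theta}{\partial y} = \frac{1}{x} \quad\text{whence}\quad \frac{\partial\theta}{\partial y} = \frac{x}{r^2} = \frac{\cos\theta}{r}.$$

Combination of these results with Eqns (10.15), followed by some simple manipulation, then establishes that the polar form of the Cauchy-Riemann equations is

$$\frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial\theta}, \qquad \frac{1}{r}\frac{\partial u}{\partial\theta} = -\frac{\partial v}{\partial r}.$$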
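The polar form of the Cauchy-Riemann equations, $u_r = (1/r)v_\theta$ and $(1/r)u_\theta = -v_r$, can be checked numerically. The sketch below is an illustration under stated assumptions: derivatives are approximated by central differences, the helper name is hypothetical, and the test functions $e^z$ and $\bar{z}$ are arbitrary choices.

```python
import cmath

def polar_cr_residuals(f, r, theta, step=1e-6):
    """Return (u_r - v_theta / r, u_theta / r + v_r), estimated by central differences."""
    def g(radius, angle):
        # Evaluate f at the point with polar coordinates (radius, angle).
        return f(cmath.rect(radius, angle))

    f_r = (g(r + step, theta) - g(r - step, theta)) / (2 * step)
    f_t = (g(r, theta + step) - g(r, theta - step)) / (2 * step)
    return f_r.real - f_t.imag / r, f_t.real / r + f_r.imag

res_exp = polar_cr_residuals(cmath.exp, 1.3, 0.9)
res_conj = polar_cr_residuals(lambda z: z.conjugate(), 1.3, 0.9)

print(res_exp, res_conj)
```

For the analytic function $e^z$ both residuals are negligible, whereas for $\bar{z}$ they are not.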


Functions $f(z)$ that are uniquely defined in some neighbourhood of a point $z_0$ and satisfy the Cauchy-Riemann equations at $z_0$ and throughout that neighbourhood are called either analytic or regular functions. Points at which a function ceases to be analytic are called singularities of the function. Thus the function $f(z) = 1/(z + 1)$ is easily seen to be analytic everywhere except at the point $z = -1$, which is a singularity.

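Supposing that $u_{xy}$, $v_{xy}$ exist and are continuous, it follows directly by partial differentiation of the Cauchy-Riemann equations $u_x = v_y$, $u_y = -v_x$ that

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \quad\text{and}\quad \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0. \qquad (10.17)$$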


These equations are identical in form and are examples of an important partial differential equation called Laplace's equation, any solution of which is called a harmonic function. The harmonic functions $u$ and $v$ associated with an analytic function $f(z) = u + iv$ are called conjugate harmonic functions. For example, we have seen that

$$\cos z = \cos x \cosh y - i\sin x \sinh y$$

is an analytic function with $u = \cos x \cosh y$, $v = -\sin x \sinh y$. Now both $u$ and $v$ are such that $u_{xy}$, $v_{xy}$ are continuous, so it follows immediately that $u$ and $v$ satisfy Eqns (10.17). Hence $u = \cos x \cosh y$, $v = -\sin x \sinh y$ are conjugate harmonic functions. The term conjugate is, of course, used here in a different sense from when discussing complex conjugates.
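That $u = \cos x \cosh y$ and $v = -\sin x \sinh y$ satisfy Laplace's equation can be confirmed numerically. The following sketch uses the standard five-point finite-difference approximation to $u_{xx} + u_{yy}$; the step size and sample point are assumptions, and the non-harmonic function $x^2 + y^2$ is included only for contrast.

```python
import math

def laplacian(g, x, y, step=1e-4):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + step, y) + g(x - step, y)
            + g(x, y + step) + g(x, y - step)
            - 4 * g(x, y)) / step**2

def u(x, y):
    return math.cos(x) * math.cosh(y)

def v(x, y):
    return -math.sin(x) * math.sinh(y)

def not_harmonic(x, y):
    return x**2 + y**2   # its Laplacian is 4 everywhere

lap_u = laplacian(u, 0.5, 0.3)
lap_v = laplacian(v, 0.5, 0.3)
lap_other = laplacian(not_harmonic, 0.5, 0.3)

print(lap_u, lap_v, lap_other)
```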

If u, v are harmonic functions and we consider the analytic function 
w =u + iv, then an obvious modification of the arguments that gave rise 
to Rule 1 leads to the following rule for the expression of w in terms of z. 


Rule 2 (Expression of an analytic function in terms of z) 
If u, v are conjugate harmonic functions, then the analytic function w = 
u + iv expressed in terms of z may be deduced either by: 
(a) formally setting y = 0 in the expression w = u + iv and then re- 
placing x by z; or 
(b) formally setting x = 0 in the expression w = u + iv and then re- 
placing iy by z. 


Example 10.16 Show that $u = 2xy + 3y$ is harmonic and determine its harmonic conjugate $v$. Express the functions $dw/dz$ and $w = u + iv$ in terms of $z$.


Solution We have $u_x = 2y$, $u_{xx} = 0$, $u_y = 2x + 3$, $u_{yy} = 0$, showing that $u_{xx} + u_{yy} = 0$. Hence $u$ is harmonic. If $v$ is to be the harmonic conjugate of $u$ then the functions $u$, $v$ must satisfy the Cauchy-Riemann equations

$$u_x = v_y, \qquad u_y = -v_x.$$

Using the known expressions for $u_x$, $u_y$ we find that

(a) $2y = v_y$, and (b) $2x + 3 = -v_x$.

Integration then gives:
from (a),

$$v = y^2 + f(x) + \text{const},$$

from (b),

$$v = -x^2 - 3x + g(y) + \text{const},$$

where as yet $f(x)$ is an arbitrary function of $x$ and $g(y)$ is an arbitrary function of $y$. However, as these are two alternative expressions for the same function $v$ they must be identical, whence $f(x) = -(x^2 + 3x)$ and $g(y) = y^2$. Thus we have arrived at the expression

$$v = y^2 - x^2 - 3x + \text{const}$$

for the function $v$, which is the harmonic conjugate of $u$.
Applying Rule 1 (a) to find $dw/dz$ requires that we start from

$$\frac{dw}{dz} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}$$

or, in this case, from

$$\frac{dw}{dz} = 2y - i(2x + 3).$$

So, setting $y = 0$ and replacing $x$ by $z$, gives

$$\frac{dw}{dz} = -i(2z + 3).$$

To express $w = u + iv$ in terms of $z$ we must work with Rule 2. We have

$$w = (2xy + 3y) + i(y^2 - x^2 - 3x) + \text{const},$$

so that if we apply Rule 2 (a), we must set $y = 0$ and replace $x$ by $z$ to arrive at

$$w = -i(z^2 + 3z) + \text{const}.$$
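The outcome of this example is easily verified by direct substitution. The sketch below is illustrative only: it takes the arbitrary constant as zero, and the sample points are arbitrary. It compares $u + iv$ assembled from the real expressions with the closed form $-i(z^2 + 3z)$.

```python
def w_from_parts(x, y):
    """w = u + iv with u = 2xy + 3y and v = y^2 - x^2 - 3x (constant taken as zero)."""
    u = 2 * x * y + 3 * y
    v = y**2 - x**2 - 3 * x
    return complex(u, v)

def w_closed_form(z):
    """The expression obtained from Rule 2 (a), with the constant taken as zero."""
    return -1j * (z * z + 3 * z)

samples = [(0.5, 1.2), (-2.0, 0.3), (1.0, -1.0)]   # arbitrary points (assumption)
max_gap = max(
    abs(w_from_parts(x, y) - w_closed_form(complex(x, y))) for x, y in samples
)

print(max_gap)
```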

It is important to notice when using Rule 2 that the functions $u$ and $v$ must be conjugate harmonic functions, since otherwise they will not satisfy the Cauchy-Riemann equations and the rule will be inapplicable. Indeed, if the rule is applied to harmonic functions that are not conjugate, then the functions of $z$ that are generated by Rules 2 (a) and 2 (b) may or may not be identical. In neither case will the result be correct. For example,

$$u = \sin x \cosh y \quad\text{and}\quad v = \cos x \cosh y$$

are harmonic functions but they are not harmonic conjugates. Applying Rule 2 (a) to $w = u + iv$ generates the function $w = \sin z + i\cos z$, whereas applying Rule 2 (b) generates the function $w = i\cos z$. For a different example, take $u = x^2 - y^2$ and $v = xy$, which are also harmonic functions that are not conjugate. In this case both Rules 2 (a) and 2 (b) generate the same function $w = z^2$, though of course this also is incorrect.


10.5 Conformal mapping


Thus far we have examined some of the analytical consequences of requiring 
that a function w = f(z) be differentiable. Let us now pursue this matter 
further by studying some of the geometrical implications of differentiability. 

Take two complex planes, which we shall refer to as the z-plane and the 
w-plane, the connection between their respective points being through the 
differentiable function w = f(z). Because each value of z gives rise to a unique 
value of w, it follows that any curve $\gamma$ in the z-plane must correspond to
some other curve $\Gamma$ in the w-plane. In this sense the w-plane can correctly be
described as a mapping of the z-plane. 

For a specific illustration, let us determine how the straight line $y = \alpha x$ in the z-plane is mapped by the function $w = iz + (1 + i)$ onto the w-plane. We begin by setting $w = u + iv$, $z = x + iy$, after which a simple calculation yields $u = 1 - y$, $v = x + 1$. Hence to find the line in the w-plane that corresponds to $y = \alpha x$ in the z-plane it is now only necessary to set $y = \alpha x$ in these expressions for $u$, $v$ and then to eliminate $x$ between them. Performing these operations we find $u = 1 - \alpha x$, $v = x + 1$, whence

$$v = \left(\frac{1 + \alpha}{\alpha}\right) - \frac{1}{\alpha}u.$$

This is again an equation of a straight line but this time in the w-plane. The line passes through the point $(0, (1 + \alpha)/\alpha)$ and has the gradient $-1/\alpha$. Representative lines $\gamma_1$, $\gamma_2$ are shown in the z-plane of Fig. 10.4 and their respective maps or images are shown as the lines $\Gamma_1$, $\Gamma_2$ in the associated w-plane. The lines $\gamma_1$, $\gamma_2$ correspond, respectively, to $\alpha = 1$, $\alpha = 2$.

Fig. 10.4 Mapping by the function $w = iz + (1 + i)$.

It is not difficult to see that the map in the w-plane has been obtained from the map in the z-plane by first rotating the original pair of lines anti-clockwise through an angle $\tfrac{1}{2}\pi$ and then translating the resulting picture to the point $1 + i$ as a new origin. More important than this, however, is the fact that the angle $\theta$ between the lines $\gamma_1$, $\gamma_2$ is equal to the angle between the lines $\Gamma_1$, $\Gamma_2$ and, moreover, the sense of rotation is preserved. That is to say if $\gamma_2$ is inclined to $\gamma_1$ at an angle $\theta$, measured anti-clockwise, then $\Gamma_2$ is also inclined to $\Gamma_1$ at an angle $\theta$, measured anti-clockwise.

This is no chance result and, indeed, we now prove that if a function $f(z)$ is analytic (that is, satisfies the Cauchy-Riemann equations and so has a uniquely defined derivative) then, except for points $z_0$ at which $f'(z_0) = 0$, the function $w = f(z)$ will preserve both the angle and the sense of rotation when mapping intersecting curves $\gamma_1$, $\gamma_2$ in the z-plane onto corresponding intersecting curves $\Gamma_1$, $\Gamma_2$ in the w-plane. These properties of a mapping or transformation are recognized by saying that the transformation is conformal.

To prove this general result we now consider a function $w = f(z)$ that is analytic in some region of the z-plane and take a point $z_0$ in that region at which $f'(z_0) \neq 0$. Let $\gamma_1$, $\gamma_2$ be two curves drawn in the z-plane that intersect at $z_0$ and let $z_1$ denote a point Q on the curve $\gamma_1$ as indicated in Fig. 10.5. We shall suppose that as Q moves away from P along $\gamma_1$ in the direction indicated by an arrow in the Figure, so the point $w_1 = f(z_1)$, which we denote by Q', moves away from point P' in the direction indicated. This process thus associates a sense of direction with each of the corresponding curves $\gamma_1$ and $\Gamma_1$. A similar argument defines directions along $\gamma_2$ and $\Gamma_2$.

Fig. 10.5 Conformal mapping $w = f(z)$.

Now as Q approaches P, so the secant PQ will assume its limiting position in which, when it is inclined at an angle $\alpha_1$ to the x-axis, it is tangent to $\gamma_1$ at $z_0$. As $PQ = z_1 - z_0$ we have

$$\alpha_1 = \lim_{z_1 \to z_0} \arg(z_1 - z_0).$$

Identical reasoning shows that

$$\beta_1 = \lim_{z_1 \to z_0} \arg(w_1 - w_0),$$

where $\beta_1$ is the angle of the tangent to $\Gamma_1$ at P' measured from the u-axis. Hence we have

$$\beta_1 - \alpha_1 = \lim_{z_1 \to z_0} \arg(w_1 - w_0) - \lim_{z_1 \to z_0} \arg(z_1 - z_0)$$

and, as $\arg a - \arg b = \arg(a/b)$, this may be written

$$\beta_1 - \alpha_1 = \lim_{z_1 \to z_0} \arg\left(\frac{w_1 - w_0}{z_1 - z_0}\right).$$

However, as we are assuming $f(z)$ is differentiable

$$f'(z_0) = \lim_{z_1 \to z_0} \left(\frac{w_1 - w_0}{z_1 - z_0}\right)$$

and provided $f'(z_0) \neq 0$ it then follows that

$$\beta_1 - \alpha_1 = \arg f'(z_0). \qquad (10.18)$$


In the case that $f'(z_0) = 0$, the amplitude of $f'(z_0)$ is indeterminate. Such points are called critical points of $f(z)$, by analogy with the real variable case.

We have seen that $f'(z_0)$ is unique, so that the expression on the right-hand side of Eqn (10.18) is a constant. The result must, then, also be true for any other curve $\gamma_2$, say, and its map $\Gamma_2$. Hence we have

$$\beta_1 - \alpha_1 = \beta_2 - \alpha_2$$

or

$$\alpha_2 - \alpha_1 = \beta_2 - \beta_1.$$

The curves $\gamma_1$, $\gamma_2$ were any two curves which intersected at $z_0$, so we have proved the following result.


THEOREM 10.7 (conformal mapping) If $f(z)$ is analytic in some region,
then apart from those points $z_0$ in that region for which $f'(z_0) = 0$, the
mapping w = f(z) preserves both the angle and the sense of rotation when 
mapping intersecting directed pairs of curves in the z-plane into corresponding 
intersecting directed pairs of curves in the w-plane. Such a mapping is said 
to be conformal. 
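The content of the theorem can be illustrated numerically. In the sketch below, which is an illustration under stated assumptions rather than part of the proof, two short straight segments leave a point $z_0$ at chosen angles, and the directions of their images under $w = z^2$ are compared. The function, the point $z_0 = 1 + 1.5i$, the angles and the helper names are all arbitrary choices.

```python
import cmath

def image_direction(f, z0, angle, t=1e-6):
    """Approximate direction in the w-plane of the image of a ray leaving z0 at `angle`."""
    return f(z0 + t * cmath.exp(1j * angle)) - f(z0)

def signed_angle(d1, d2):
    """Signed (anti-clockwise positive) angle from direction d1 to direction d2."""
    return cmath.phase(d2 / d1)

def f(z):
    return z * z

z0 = 1 + 1.5j
a1, a2 = 0.2, 1.1   # inclinations of the two rays to the x-axis

d1 = image_direction(f, z0, a1)
d2 = image_direction(f, z0, a2)

angle_in_z = a2 - a1
angle_in_w = signed_angle(d1, d2)          # preserved when f'(z0) != 0
rotation = cmath.phase(d1) - a1            # compare with arg f'(z0), Eqn (10.18)

# At the critical point z0 = 0, where f'(0) = 0, the angle is doubled instead.
angle_at_critical = signed_angle(image_direction(f, 0, a1),
                                 image_direction(f, 0, a2))

print(angle_in_z, angle_in_w, rotation, cmath.phase(2 * z0), angle_at_critical)
```

Away from the critical point the angle and its sense are reproduced, and each direction is turned through $\arg f'(z_0)$; at $z = 0$ the angle between the images is twice the original angle.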


To close this chapter we now examine some important special conformal 
mappings. Rather than emphasize the algebraic details of the transformations 
or mappings, we shall aim primarily at interpretation in terms of basic 
geometrical operations such as translation, rotation, and change of scale 
(dilatation). 


10.5 (a) The general linear transformation


The general linear transformation is the name given to the mapping described by the equation

$$w = az + b, \qquad (10.19)$$

where $a$, $b$ are arbitrary constants with $a \neq 0$. Our introductory example was of this form with $a = i$, $b = 1 + i$. The mapping (10.19) obviously satisfies the Cauchy-Riemann equations and, as $dw/dz = a \neq 0$, it has no critical points and so provides a conformal mapping of the entire z-plane. To appreciate the geometrical effect of this mapping consider first the case in which $a = 1$ so that $w = z + b$.

This has the effect of generating the w-plane by simply adding a constant complex number $b$ to every point in the z-plane. Using the vectorial representation of complex numbers this is seen to be equivalent to generating the w-plane by shifting the entire z-plane through a distance $|b|$ parallel to the vector $b$. Such a mapping is accordingly called a translation. Another way of expressing this result is by saying that if the w- and z-planes were to be superimposed, then the O{u, v} axes would be obtained by translating the O{x, y} axes, without rotation, such that in their new position the origin coincided with the point $z = -b$. To see this, remember that $b$ is a vector and that the point $z = 0$ has the image $w = b$, so that the position vector of the origin of O{x, y} relative to O{u, v} is $b$, whereas the position vector of the origin of O{u, v} relative to O{x, y} is $-b$. Consequently, we may conclude that the mapping $w = z + b$ leaves invariant the shape and size of any curve in the z-plane.

Next we consider the consequences of setting $b = 0$ so that $w = az$. If we write $a = \rho e^{i\alpha}$ and $z = re^{i\theta}$, we have $w = \rho r e^{i(\alpha + \theta)}$. This shows that the effect on the z-plane of the mapping $w = az$ is to multiply the modulus of $z$ by a constant factor $\rho$ and to increase the argument of $z$ by a constant angle $\alpha$. Hence $w = az$ corresponds to a magnification, or dilatation, of every $z$ by a constant factor $|a|$, and a rotation about the origin of every $z$ by a constant angle $\alpha$. Thus we may deduce that the general linear transformation

$$w = az + b$$

of the z-plane may be described geometrically as the combination of a dilatation, a rotation, and a translation. In the trivial case $a = 1$, $b = 0$ the mapping reduces to an identity.


10.5 (b) The mapping $w = z^n$

A typical example of this form is provided by the function $w = z^2$. As it is interesting to interpret mappings in terms of both polar coordinates and cartesian coordinates, let us first study the polar representation. To do this we set $z = re^{i\theta}$, $w = \rho e^{i\phi}$, when we find

$$\rho(\cos\phi + i\sin\phi) = r^2(\cos 2\theta + i\sin 2\theta),$$

showing that $\rho = r^2$ and $\phi = 2\theta + 2n\pi$, where $n = 0, 1, 2, \ldots$. However, for our purposes we shall disregard this ambiguity of the angle $\phi$ with respect to multiples of $2\pi$, since all angles in polar coordinates are indeterminate in this manner.

In words, the effect of the mapping $w = z^2$ is to square the modulus of every number $z$ and to double its argument. This is very easily illustrated by appeal to Fig. 10.6 depicting the mapping of a shaded portion of an annular region in the z-plane into another, larger, annular region in the w-plane. The conformal nature of the mapping is reflected by the fact that at the corresponding corners of the figures the angles between the boundary lines together with their senses have been preserved. They are of course equal to $\tfrac{1}{2}\pi$ in this instance.

Because of the properties just outlined it is readily seen that the function $w = z^2$ maps the upper half z-plane onto the entire w-plane. When this is done it is necessary to exclude the origin in the w-plane together with all the points on the positive u-axis, since these are mapped twice. In fact they correspond to points on both the positive and negative parts of the real axis in the z-plane. The origin in the w-plane is in fact a critical point, for $w' = 2z$ vanishes at $z = 0$. This exclusion of a line of points in the w-plane is often described by saying that the w-plane has been cut along the real axis.

Fig. 10.6 The polar mapping $w = z^2$.

The effect of the mapping is more striking if it is displayed in terms of $x$ and $y$ by again setting $w = u + iv$, but this time writing $z = x + iy$ to obtain $u = x^2 - y^2$, $v = 2xy$. These equations show, for example, that the straight line $x = \alpha$ maps into the curve $u = \alpha^2 - y^2$, $v = 2\alpha y$ in the w-plane which, after elimination of $y$, is seen to be equivalent to $v^2 = 4\alpha^2(\alpha^2 - u)$. Similarly, the straight line $y = \beta$ may be seen to map into the curve $v^2 = 4\beta^2(\beta^2 + u)$ in the w-plane. These equations describe two parabolas that are symmetrical about the u-axis, as shown in Fig. 10.7.

Fig. 10.7 The Cartesian mapping $w = z^2$.

The lines $x = 1$, $y = 3/2$ denoted by $\gamma_1$ and $\gamma_2$, respectively, in the z-plane map into the parabolas $\Gamma_1$ and $\Gamma_2$ in the w-plane. These parabolas intersect in the pair of points P' and P'' in the w-plane. As $w = z^2$ is single valued, the point $z = 1 + 3i/2$ denoted by P in the z-plane (that is, the point (1, 3/2)) maps into just one of them, namely $w = -5/4 + 3i$; the other point of intersection, $w = -5/4 - 3i$, is the image of $z = 1 - 3i/2$, which lies on the line $y = -3/2$ that maps into the same parabola $\Gamma_2$. Again the conformal nature of the transformation is reflected in the easily checked geometrical fact that the two families of parabolas are mutually orthogonal, as are the lines $x = \text{const}$, $y = \text{const}$ in the z-plane.
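The two parabolas can be checked by direct computation. The sketch below is illustrative only (the sample values of $x$ and $y$ and the variable names are assumptions): it maps points of the lines $x = 1$ and $y = 3/2$ by $w = z^2$ and measures how far the images lie from the curves $v^2 = 4\alpha^2(\alpha^2 - u)$ and $v^2 = 4\beta^2(\beta^2 + u)$.

```python
alpha, beta = 1.0, 1.5
samples = [-2.0, -0.5, 0.0, 0.75, 3.0]   # arbitrary parameter values along each line

def image(x, y):
    """Return (u, v) for w = z^2 with z = x + iy."""
    w = complex(x, y) * complex(x, y)
    return w.real, w.imag

# Images of points on the line x = alpha should satisfy v^2 = 4 alpha^2 (alpha^2 - u).
gap_vertical = 0.0
for y in samples:
    u, v = image(alpha, y)
    gap_vertical = max(gap_vertical, abs(v * v - 4 * alpha**2 * (alpha**2 - u)))

# Images of points on the line y = beta should satisfy v^2 = 4 beta^2 (beta^2 + u).
gap_horizontal = 0.0
for x in samples:
    u, v = image(x, beta)
    gap_horizontal = max(gap_horizontal, abs(v * v - 4 * beta**2 * (beta**2 + u)))

image_of_P = image(1.0, 1.5)   # image of the point z = 1 + 3i/2

print(gap_vertical, gap_horizontal, image_of_P)
```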

The more general mapping $w = z^n$ may be analysed in similar fashion, though the algebraic complexity is naturally greater. When $n$ is integral the mapping may be seen to transform the segment $0 < \arg z < 2\pi/n$ into the complete w-plane with a suitable cut along the u-axis. (Care must be exercised when $n$ is fractional for then the mapping is many valued. We shall not pursue this matter further.)


10.5 (c) The inversion $w = 1/z$


For obvious reasons the mapping $w = 1/z$ is called the inversion mapping. Its geometrical effect may be deduced by setting $w = \rho e^{i\phi}$, $z = re^{i\theta}$ to find

$$\rho(\cos\phi + i\sin\phi) = \frac{1}{r}(\cos\theta - i\sin\theta).$$

Arguing as with the function $w = z^2$, we then see that this implies that $\rho = 1/r$, $\phi = -\theta$.

Expressed in words, the inversion mapping $w = 1/z$ transforms a point in the z-plane with modulus $r$ and argument $\theta$ into a point in the w-plane with modulus $1/r$ and argument $-\theta$. This may be interpreted geometrically by appeal to Fig. 10.8 in which the w- and z-planes are shown superimposed with a common origin, and P is any point in the z-plane with P' denoting its image in the w-plane.

The circle shown in Fig. 10.8 is the unit circle $|z| = 1$, and point Q on the radius vector drawn from O to P is such that $OP \cdot OQ = 1$. Hence if $OP = r$, then $OQ = 1/r$. In geometrical terms point Q is said to have been obtained by inverting point P with respect to the unit circle. Point P', which is the image in the w-plane of the point P in the z-plane, is then obtained by reflecting Q in the x-axis.

Thus the mapping $w = 1/z$ corresponds to the inversion of points $z$ with respect to the unit circle, followed by their reflection in the real axis. The inversion mapping thus maps the points interior to the unit circle about the origin of the z-plane onto the exterior of the unit circle about the origin of the w-plane, and vice-versa. The two unit circles map onto one another.

Algebraically, we write $w = u + iv$, $z = x + iy$, when

$$u = \frac{x}{x^2 + y^2}, \qquad v = \frac{-y}{x^2 + y^2}.$$

Fig. 10.8 Inversion in unit circle followed by reflection in the x-axis.

To learn how the line $x = \alpha$ in the z-plane maps onto the w-plane we need only set $x = \alpha$ in the expressions for $u$ and $v$ and then eliminate $y$ to obtain the equation

$$u^2 + v^2 - \frac{u}{\alpha} = 0.$$

Similarly, the line $y = \beta$ in the z-plane maps onto the curve in the w-plane defined by the equation

$$u^2 + v^2 + \frac{v}{\beta} = 0.$$

When these equations are rewritten in the form

$$\left(u - \frac{1}{2\alpha}\right)^2 + v^2 = \left(\frac{1}{2\alpha}\right)^2, \qquad u^2 + \left(v + \frac{1}{2\beta}\right)^2 = \left(\frac{1}{2\beta}\right)^2$$

it is easily seen that the line $x = \alpha$ in the z-plane has for its image in the w-plane a circle of radius $1/(2|\alpha|)$ with its centre at $(1/(2\alpha), 0)$, whilst the line $y = \beta$ in the z-plane has for its image in the w-plane a circle of radius $1/(2|\beta|)$ with its centre at $(0, -1/(2\beta))$. We may conclude that lines parallel to the x- and y-axes map onto circles in the w-plane which pass through the origin and have their centres on the u- and v-axes.

Had the general straight line y = mx + c in the z-plane been mapped, 
then this same form of argument would have shown that any such line not 
passing through the origin will transform into a circle through the origin in 
the w-plane. Lines through the origin in the z-plane transform into lines 
through the origin in the w-plane. The verification of these remarks is left 
as an exercise for the reader. 
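These statements about the inversion mapping can be explored numerically before attempting the algebra. The following sketch is an illustration only: the values $\alpha = 0.5$, $m = 2$, $c = 1$ and the sample points are arbitrary assumptions. It checks that images of points on $x = \alpha$ lie on the circle of centre $(1/(2\alpha), 0)$ and radius $1/(2|\alpha|)$, and that images of three points of a general line $y = mx + c$ are concyclic with the origin.

```python
alpha = 0.5
centre = complex(1 / (2 * alpha), 0.0)
radius = 1 / (2 * abs(alpha))

# Distance of each image point from the predicted circle for the line x = alpha.
deviation = max(
    abs(abs(1 / complex(alpha, y) - centre) - radius)
    for y in (-10.0, -1.0, 0.0, 0.3, 2.0, 50.0)
)

def circumcentre(p, q, r):
    """Centre of the circle through three non-collinear complex points."""
    a, b = q - p, r - p
    # Standard formula: centre = p + (|a|^2 b - |b|^2 a) / (conj(a) b - a conj(b)).
    return p + (abs(a) ** 2 * b - abs(b) ** 2 * a) / (a.conjugate() * b - a * b.conjugate())

m, c = 2.0, 1.0   # a general line y = m x + c not through the origin
images = [1 / complex(x, m * x + c) for x in (-1.0, 0.5, 3.0)]
centre_general = circumcentre(*images)
# The origin should lie on the same circle as the three image points.
origin_gap = abs(abs(centre_general) - abs(images[0] - centre_general))

print(deviation, origin_gap)
```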


10.5 (d) The bilinear transformation

Any mapping of the general form

$$w = \frac{az + b}{cz + d} \qquad (10.21)$$

is called a bilinear transformation or a linear fractional transformation. The general linear transformation and the inversion mapping are special cases of the bilinear transformation. We now show that bilinear transformations are characterized by the property that they map circles and straight lines in the z-plane onto circles and straight lines in the w-plane, though not necessarily in this order.

Let us now write the transformation (10.21) in the form

$$w = \frac{a}{c} - \frac{ad - bc}{c^2[z + (d/c)]}. \qquad (10.22)$$

We assume $c \neq 0$ and $ad - bc \neq 0$; this is justified since if $c = 0$ the transformation reduces to the general linear transformation, whereas if $ad - bc = 0$, then $w$ reduces to a constant. So, if we define new variables $z_1$ and $z_2$ by

$$z_1 = z + \frac{d}{c}, \qquad z_2 = \frac{1}{z_1}, \qquad (10.23)$$

then (10.22) becomes

$$w = \frac{a}{c} - \left(\frac{ad - bc}{c^2}\right)z_2. \qquad (10.24)$$


We must now consider the sequential effect of the mappings that transform
from the z-plane to the w-plane via the intermediate planes z_1 and z_2.
The mapping from the z-plane to the z_1-plane is a pure translation and thus
leaves the shape and size of all curves invariant. The mapping from the
z_1-plane to the z_2-plane is an inversion and, as we have just seen, maps
straight lines not passing through the origin onto circles, and straight lines
through the origin onto straight lines. Circles are likewise mapped by the
inversion onto circles or, if they pass through the origin, onto straight
lines. Finally, the mapping from the z_2-plane to the w-plane is a general
linear transformation and so comprises a rotation, a magnification, and a
translation. Hence, in particular, this final mapping will transform straight
lines into straight lines and circles into circles. This justifies our
earlier statement that the bilinear transformation maps straight lines and
circles into straight lines and circles, though not necessarily in this order.
Example 10-17 Find the image in the w-plane of the circle | z | = 2 if

\[ w = \frac{z - i}{z + i}. \]

Solution Setting w = u + iv, z = x + iy we find that

\[ u = \frac{x^2 + y^2 - 1}{x^2 + y^2 + 2y + 1}, \qquad
   v = \frac{-2x}{x^2 + y^2 + 2y + 1}. \]

Now the circle | z | = 2 has the equation x² + y² = 4, which used in the
expressions for u, v gives

\[ u = \frac{3}{2y + 5}, \qquad v = \frac{-2x}{2y + 5}. \]

Next, solving these for x and y, we find

\[ x = \frac{-3v}{2u}, \qquad y = \frac{1}{2}\left(\frac{3}{u} - 5\right), \]

so that on the required circle x² + y² = 4 this pair of equations is equivalent
to

\[ 3(u^2 + v^2) - 10u + 3 = 0. \]

When this equation is expressed in the form

\[ \left(u - \frac{5}{3}\right)^2 + v^2 = \frac{16}{9} \]

it can be recognized as the equation of a circle in the w-plane having a
radius of 4/3 and its centre at the point (5/3, 0).

This conclusion could have been obtained more easily by using the
following argument. The equation

\[ w = \frac{z - i}{z + i} \]

is equivalent to

\[ z = i\left(\frac{1 + w}{1 - w}\right). \]

Hence, as z z̄ = x² + y², we have

\[ x^2 + y^2 = i\left(\frac{1 + w}{1 - w}\right)(-i)\left(\frac{1 + \bar{w}}{1 - \bar{w}}\right)
   = \frac{1 + w + \bar{w} + w\bar{w}}{1 - w - \bar{w} + w\bar{w}}. \]

In terms of w = u + iv, w̄ = u − iv this becomes

\[ x^2 + y^2 = \frac{1 + 2u + u^2 + v^2}{1 - 2u + u^2 + v^2} \]

and, on the circle x² + y² = 4, it reduces to our previous result

\[ 3(u^2 + v^2) - 10u + 3 = 0. \]
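The result of Example 10-17 is easily confirmed numerically. The sketch below, which assumes nothing beyond the mapping w = (z − i)/(z + i) itself, maps equally spaced points of | z | = 2 and measures how far their images lie from the circle of centre (5/3, 0) and radius 4/3; the function names are illustrative only.

```python
import cmath
import math


def bilinear(z, a, b, c, d):
    """Evaluate w = (a z + b)/(c z + d), assuming c z + d is non-zero."""
    return (a * z + b) / (c * z + d)


def image_of_circle(radius, n=12):
    """Images under w = (z - i)/(z + i) of n equally spaced points of |z| = radius."""
    points = []
    for k in range(n):
        z = radius * cmath.exp(2j * math.pi * k / n)
        points.append(bilinear(z, 1, -1j, 1, 1j))
    return points


centre, expected_radius = 5 / 3, 4 / 3
image = image_of_circle(2.0)

# Distance of each image point from the predicted circle.
max_error = max(abs(abs(w - centre) - expected_radius) for w in image)

# Residual of 3(u^2 + v^2) - 10u + 3 = 0 at each image point.
max_residual = max(abs(3 * abs(w) ** 2 - 10 * w.real + 3) for w in image)

print(max_error, max_residual)
```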


10-6 Applications of conformal mapping


In any first account of the theory of conformal mapping, it is impossible to 
do more than merely indicate its application in science and engineering. 
From the fields of elasticity, electromagnetic theory, fluid mechanics, and 
heat conduction in which these ideas play important roles, we choose just one 
simple example. Our choice, from fluid mechanics, is solving the problem of 
the two-dimensional flow of an incompressible fluid around the interior of a 
wedge shaped region, on the assumption that the flow has a special property 
which enables it to be classified as being irrotational. These are in fact con- 
ditions which are usually valid in most low speed flows of ordinary fluids. 

In books on fluid mechanics it is established that if q_1 and q_2 are the x
and y components of velocity at a point in an incompressible inviscid fluid
that is undergoing two-dimensional flow, then under the stated conditions
these components may be written in the form

\[ q_1 = \frac{\partial \phi}{\partial x}, \qquad q_2 = \frac{\partial \phi}{\partial y}, \tag{10-25} \]

where φ(x, y) is a function called the velocity potential of the flow. The lines
φ(x, y) = constant are called equipotentials. Using the vector interpretation
of complex numbers we may thus represent the fluid velocity q by the complex
variable

\[ q = \frac{\partial \phi}{\partial x} + i \frac{\partial \phi}{\partial y}. \tag{10-26} \]

It can also be established that if fluid is neither created nor lost within the
flow region, then φ(x, y) must be such that

\[ \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = 0. \tag{10-27} \]

Thus φ satisfies Laplace's equation and so is harmonic. Introducing the
harmonic conjugate of φ, which we shall denote by ψ(x, y), enables us to
define a further complex variable F(z) by the equation

\[ F(z) = \phi(x, y) + i\psi(x, y). \tag{10-28} \]

This is called the complex potential and ψ(x, y) itself is called the stream
function of the flow. Now by the nature of the construction of F(z), it is
differentiable in the sense of Definition 10-6 and so satisfies the Cauchy-
Riemann equations. Hence

\[ \phi_x = \psi_y, \qquad \phi_y = -\psi_x, \]

or, in terms of q_1 and q_2,

\[ q_1 = \phi_x = \psi_y, \qquad q_2 = \phi_y = -\psi_x. \tag{10-29} \]

These relationships provide the justification for the name stream function,
for they show that the velocity vector is everywhere normal to the curves
φ(x, y) = const. This follows because on φ(x, y) = const, φ_x dx + φ_y dy = 0
showing that

\[ \frac{dy}{dx} = -\frac{\phi_x}{\phi_y}. \]

Hence if n is the gradient of the normal to a curve φ(x, y) = const, then
n(dy/dx) = −1, whence n = φ_y/φ_x. However, from results (10-29) this is
equivalent to n = q_2/q_1, which is the slope of the curve traced by a fluid
particle. Now on a curve ψ(x, y) = const we have dy/dx = −ψ_x/ψ_y which, by
(10-29), is also equal to q_2/q_1. Hence the curves ψ(x, y) = const are curves
along which fluid flows and so can properly be called streamlines.

Consider the complex potential

\[ F(z) = U_0 z, \tag{10-30} \]

where U_0 is a positive real number. Then we have at once

\[ \phi = U_0 x, \qquad \psi = U_0 y. \tag{10-31} \]

The streamlines ψ = α are thus the lines y = α/U_0, and the velocity q is

\[ q = \frac{\partial \phi}{\partial x} + i \frac{\partial \phi}{\partial y} = U_0. \]

Thus the complex potential F(z) = U_0 z must characterize a uniform flow,
with velocity U_0 parallel to the x-axis and directed in the sense of increasing
x. This is illustrated in Fig. 10-9 (a).

Fig. 10-9 Transformation of fluid flow: (a) uniform flow in upper half plane;
(b) flow inside wedge.
Now if we consider the transformation

\[ w = z^{1/3}, \tag{10-32} \]

then we know from the arguments used in connection with the mapping
w = z^n that it will map the upper half of the z-plane onto the wedge
0 < arg w < π/3 in the w-plane.

Then, as (10-32) is equivalent to z = w³, we must have

\[ x + iy = (u^3 - 3uv^2) + i(3u^2 v - v^3), \]

giving

\[ x = u^3 - 3uv^2, \qquad y = 3u^2 v - v^3. \tag{10-33} \]

Hence the velocity potential is

\[ \phi = U_0 (u^3 - 3uv^2) \tag{10-34} \]

and the stream function

\[ \psi = U_0 (3u^2 v - v^3). \tag{10-35} \]

Thus the curves ψ = const define the streamlines inside the wedge shaped
region, and some representative streamlines are shown in Fig. 10-9 (b). To
determine the speed at any point within the wedge we use the fact that

\[ \frac{dF}{dz} = \frac{\partial \phi}{\partial x} + i \frac{\partial \psi}{\partial x} = q_1 - i q_2, \]

showing that the speed | q | is given by

\[ | q | = \left| \frac{dF}{dz} \right|. \tag{10-36} \]

Within the wedge the physical variable is w, so that (10-36) is applied with w
in place of z. As the complex potential is then

\[ F = U_0 w^3, \tag{10-37} \]

we have

\[ \frac{dF}{dw} = 3U_0 w^2 \]

and, finally,

\[ | q | = 3U_0 | w^2 | = 3U_0 | (u + iv)^2 | = 3U_0 (u^2 + v^2). \tag{10-38} \]

Thus at a point P with coordinates (u_0, v_0) within the wedge, the speed
| q | = 3U_0(u_0² + v_0²). The streamline through the point P is provided by
Eqn (10-35), for the constant associated with this streamline through P must be
U_0(3u_0²v_0 − v_0³), so that the streamline itself has the equation

\[ 3u^2 v - v^3 = 3u_0^2 v_0 - v_0^3. \]

As mentioned at the beginning of this section, conformal mapping has
many other applications, all related to solutions of Laplace’s equation in 
two dimensions. The application described here can provide no more than 
an indication of one of these situations. 
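The wedge flow described above lends itself to a short numerical sketch. It assumes a unit free-stream speed U0 = 1 (an illustrative value) and takes the speed to be the modulus of the derivative of U0 w³ with respect to w, that is 3 U0 |w|²; the function names are hypothetical. The sketch checks that the stream function vanishes on both walls of the wedge, arg w = 0 and arg w = π/3, so that the walls are themselves streamlines.

```python
import math

U0 = 1.0  # assumed free-stream speed of the uniform flow in the z-plane


def complex_potential(w):
    """Complex potential in the wedge plane, F = U0 w^3."""
    return U0 * w ** 3


def velocity_potential(u, v):
    """phi = U0 (u^3 - 3 u v^2), the real part of F."""
    return U0 * (u ** 3 - 3 * u * v * v)


def stream_function(u, v):
    """psi = U0 (3 u^2 v - v^3), the imaginary part of F."""
    return U0 * (3 * u * u * v - v ** 3)


def speed(u, v):
    """|q| = |dF/dw| = 3 U0 |w|^2."""
    return abs(3 * U0 * complex(u, v) ** 2)


# Points at distance r from the apex on the two walls of the wedge.
r = 1.7
wall_lower = stream_function(r, 0.0)
wall_upper = stream_function(r * math.cos(math.pi / 3), r * math.sin(math.pi / 3))

# Streamline constant and speed at a sample interior point P.
u0, v0 = 0.8, 0.3
constant_through_P = stream_function(u0, v0)
speed_at_P = speed(u0, v0)

print(wall_lower, wall_upper, constant_through_P, speed_at_P)
```

Plotting the level curves of stream_function over the wedge would reproduce streamlines of the kind indicated in Fig. 10-9 (b).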


PROBLEMS 


Section 10-1 


10-1 Test the following sequences {zn} for convergence and, where appropriate, 
find the limit γ stating whether or not it is a member of the sequence. 


(a) Zn = 2" + i3-*; 


3 ae 
(Ὁ) Zz, = ntan- + insin-; 
n n 


n+ (—1)” 7 
(c) Zn = ᾿--(-ἰ + 4i/; 
Cee ( . a * i(1 + =) 
n n 


1 . {na 1 ‘nw 
(6) 2, = ρ' sin (7) + 7 “95 (=) 


10:2 Give examples of: 


(a) a non-convergent sequence {zn}; 
(b) a convergent sequence {zn} with limit 2 + 3i.


10:3 Given that the sequences {wn}, {zn} are defined by 


1 ; 2n sO es n2 
m= (14) +8(55 9] and em nsin> Ἐ ἐ{ππ 1} 


find the limits of the sequences {wn + zn}, {wn zn} and {wn/zn}.


10-4 Identify the limit points of the sequence {zn} where 


ς n2+ 1 ᾿ nw 
en = din ( Ὲ }" Ὁ tos [τ 550} 


10.5 The general term of the sequence {zn} is 


a πα 2n2 + 1 ewe na 
sie 3n2 + 2n + 3 as eee 


Find values of « for which {zn} has: 

(a) one limit point, 

(b) two limit points, 

and state their location. Are the values of « unique? 


10-6 Construct examples of a sequence {zn} which has: 
(a) two limit points; 
(b) three limit points; 
(c) no limit points. 
Section 10-2
10-7 Sketch each of the following curves defined in the complex plane: 
(a) x = s, y = √(1 − s²) for −1 < s < 1;
(b) x = a sin s, y = b cos s for 0 < s < 2π (a, b real);
(c) x = cosh s, y = sinh s for −∞ < s < ∞;
(ὦ [2:2 -- ἰ| ΞΞ 3; 
(e) z z̄ = 4.


Sketch the region defined by each of the following sets of inequalities and 
indicate when the boundary points belong to the region so defined. 


10-8 Im(z + iz) ΞΘ and Rez>0O. 
109 2<|z|<3 with 0 <argz < ἐσ. 
10:10 1<|z—1|<2 and 1<|z+1| <2. 
10-11 Sketch the region that lies inside the curve defined by 
arg (z + 2) — arg (z + 3) = ἀπ 
and is such that Im z > }. 
Give an alternative representation of this region. 
10:12 Draw the curve C defined by 
arg (z — i) — arg (z — 1) = ἐπ. 


Problem 10-13 


10-13 Define the figure-eight-shaped curve shown in the diagram in terms of argu- 
ments of complex numbers. The curves Ci and C2 are arcs of circles with 
centres Οἱ and Oz, respectively. 


10:14 Sketch a simply shaped region in the complex plane and define it: 


(a) parametrically; 
(b) directly in terms of z. 
Section 10-3


10-15 For what values of z are the following complex functions defined: 


(a) w= 22+ iz+1; (b) w = (z — 1)/(z — 2); 
(c) (2 + IM — ip(z? + 4); (d) w = sinh z. 
10-16 If f(z) = u + iv, find the expressions for the functions u, v in terms of x,y 
given that: 
(a) fiz) = 2 + 22 +1; (b) f(z) = 244. 
F z+ 2’ 
(c) f(z) = cosh z; (d) f(z) = cos z. 


10-17 Given the following forms of f(z) deduce their value if z = 1 + 2]: 
(a) f(z) = χϑ + 3xy + iy?; 


x? + 2iy +1. 
(b) f(2) = π΄ τ 


(c) f(z) = sin > (x? — iy?) + icos = (x? + iy?), 


10:18 Use Definition 10-4 to prove 
lim (2z? — 1) = —(1 + 4)i). 


z>1-7 
10-19 Use Definition 10-4 to prove 
422 — ἢ 


lt ~~ | = —-6. 
eas = + 3 


10:20 Use Definition 10-4 to prove 
_ (2 — iz(z? — 1) 
lim ———-—_—— 
z—1 (z -- 1) 


= 2(2 — i). 

10:21 Given that f(z) = z? + z — 2, e(z) = z + 2 deduce: 
(a) lim [f(z) + 22(2)]; 
(b) lim f(z) g@); 


(c) lim fe). 


2>i—2 2(Z) 
10:22 Prove that 


by considering 


; Z sin Ζ 
lim - ) 
| Ζ | —0 ( Zz ; 
writing z = x + iy, and then arriving at the result by displaying the function 
whose limit is to be considered in terms of its real and imaginary parts. 
Deduce that


: sin az 
lim = a, 
[2 [|--Ὸ Ζ 


where a may be a complex number. 
10-23 Use the result 


μα (282) = 
jzl-o \ 2 


\ 


established in Problem 10-22 above together with the identity 
5 Ἴ 1 — cos? z 
02) Se 
Ζ 2 


to prove that 


lim ‘ore, - ἢ; 
|z|-+0 Ζ 


10:24 For what value of « is the function 
1+ 3z for zi 
f(z) = | 


α for z= i, 
continuous at z = i. 


10:25 Give an example of a function f(z) that: 
(a) is continuous everywhere; 
(Ὁ) has a limit 3 + 27 as z— 1 + i, but is not continuous at z = 1 +7. 


10:26 Use Definition 10:5 to give a direct proof that f(z) = z? is continuous 
everywhere. 


10:27 Use the trigonometric identity 


. . Z + Zo . {Z— Zo 
sin 2 — sin 20 = 20s ( 5 ) «sin ( 5 ) 


and the last result of Problem 10-22 above to give a direct proof that f(z) = 
sin z is continuous for all z. 


10:28 Give reasons to justify the assertion that 
f(z) = zsin (2? + 3z + 2) + If@+2—i) 
is continuous everywhere except at z = —2 + i. 
Section 10-4 
10:29 Use Definition 10-6 to prove that if w = az?, where a is any constant, then 


for all z. 
10-30 Use Definition 10-6 to prove that if f(z) is a differentiable function of z in 
some region, then in that region 


: Ξ- ΣᾺ 
q Ol =f@) +25, 


10:31 


10-32 


10°33 


10-34 


10-35 


10:36 


10:37 
By using the series representation of the hyperbolic sine function prove that


Then, using the identity 
sinh Ζι — sinh ze = 2 sinh [(Ζι — 22)/2] cosh [(z1 + z2)/2), 


which may be derived directly from identity (6:29), show by means of 
Definition 10-6 that if w = sinh z, then 
dw 


a = coshz 


for all z. 


Show by means of Definition 10-6 that the function f(z) = | z | is not differ- 
entiable at the origin. Find the limiting value assumed by the difference 
quotient at the origin (that is, with zo = 0) as h + Ο along the line y = Ax. 


Determine which of the following functions f(z) satisfy the Cauchy-Riemann 
equations: 

(a) f(z) = ζϑ — iz? + 3; 

(Ὁ) f(z) = cosh (z + 37); 

(c) f(z) = zsinz + 22; 

(d) f(z) = (8 — 3xy?) + i3x2y — y3); 

(e) f(z) = χί + 2)/2; 

(ἢ) f(z) = sinh 3x cos y + i cosh 3x sin y. 


Find the points, if any, at which the following functions are not analytic: 
(a) f(z) = 3z + sinh z; , 
(b) f(z) = 2/(z + 2); 
(c) f(z) = cos I/z; 
sin Ζ 
(6) Cae rae 
Find the values of the constants a and δ in order that the functions w should 
satisfy the Cauchy-Riemann equations: 
(a) w = asin x cosh by + i2 cos x sinh y; 
(b) w = x8 — axy? -- x + 1 + ix? — by? — 1). 


Using the method outlined in the text, show that if x = rcos 6,y =rsin 0, 
then the polar form of the Cauchy—Riemann equations is: 


ὃν 1 @ér 1 ou Ov 
oo δ Oe 
Determine which of the following functions f(z) satisfy the Cauchy-Riemann 
equations: 
(a) w = (r?2 cos? 6 + 2) + ir? sin? @; 
(0) w = (r? cos 36 + 2rcos 6 + 4) + i(r3 sin 30 + rsin 6): 


(c) w= (r+ 7) cos 04a (r—2) sin 


(d) w = r2 cos? 6 + ir? sin? 6 + 4: 
(6) w = sin (7 cos 4). cosh (r sin θ) + i cos (r cos @) . sinh (r sin 6). 
10-38


10-39 


10:40 


10-41 


10-42 


Find the values of the constants a, ὃ, and c in order that the following 
functions should satisfy the Cauchy-Riemann equations: 

(a) w = alogr + (0 + dr); 

(b) w = r@ cos $6 + ibr® sin 38. 


Verify that the following functions w satisfy the Cauchy—-Riemann equations 
and in each case express the derivative of w as a function of z: 
(a) w = (x3 — 3χγ3 + y) + 13x?y — y3 — x); 
(b) w = (x sinh x cos y — y cosh x sin y) + i(y sinh x cos y 

+ x cosh x sin y); 
(c) w = e* (cos ay + isin ay). 


Find which of the following pairs of functions are harmonic conjugates. 
Deduce the representation of w = u + iv in terms of z for the pairs that are 
harmonic conjugates, first by using Rule 2 (a), and then by using Rule 2 (b): 


(a) w= x2? — y2+ 2y, v = 2x—y — 1); 


(b) uw = sin x cosh γ, v = cos x sinh y; 
(c) u = x sin x cosh y — ycos x sinh y, 
v = —(xcos x sinh y + y sin x cosh y); 
(d) u = sinh x cos y, v = cosh x sin γ. 
Show by differentiation that v = x2 — y* + 2y is harmonic and deduce its 


harmonic conjugate u. Express the function w = u + iv, and its derivative, 
in terms of z. 


Show by differentiation that u = cosh x cos y is harmonic and deduce its 
harmonic conjugate v. Express the function w = u + iv, and its derivative, 
in terms of z. 


Section 10-5 


10°43 


10-44 


10:45 


10:46 


10-47 


Sketch the images in the w-plane of the line y = 2x — 1 in the z-plane that 
result from the mappings: 


(a) w= iz—(2+ ἢ; 


(Ὁ) ν = 22- 3; 
(c)w=1+/)z+ 1. 
Determine the images in the w-plane of the circle | z — 1 | = 1 in the z-plane 


that result from the mappings: 

(a) w= 32 -- ἰ; 

(0) w= (ἰ — )Ζ + 2. 

In each case shade the regions in the w-plane that correspond to the interior 
of the circle | z— 1] = 1. 


Sketch the region in the w-plane corresponding to the region x > 2, y < xin 
the z-plane given that 

w= (2i-1)z+ (1 +2). 
Determine the equation of the line in the w-plane which is the image of the 
line x = 1 in the z-plane under the mapping 

w = 23, 
Give an algebraic proof that if c 4 0, then the general straight line y = mx 


+ c in the z-plane is mapped by the transformation w = 1/z onto a circle 
in the w-plane. 
10-48 Find the image in the w-plane of the circle | z | = 2 if

\[ w = \frac{2z + i}{z - 3i}. \]


10-49 Show that w = e^z maps the straight lines y = const in the z-plane onto
straight lines through the origin in the w-plane, and the straight lines x = 
const in the z-plane onto circles about the origin in the w-plane. 


10-50 Locate the critical points of w = sin z and show that it maps the region
−π/2 < x < π/2, y > 0 in the z-plane onto the upper half of the w-plane.


Section 10-6 


z —plane 


Problem 10-51 


10-51 Using the argument given in the text, show how the complex potential 
F(z) = Uoz and the mapping w = z!/? may be used to find the streamlines 
indicated in the figure. 


Find the speed of flow at a point P with coordinates (uo, vo) and determine 
the streamline and the equipotential through P.


Scalars, vectors, and 
fields 


11-1 Curves in space


If the coordinates (x, y, z) of a point P in space are described by

\[ x = f(t), \qquad y = g(t), \qquad z = h(t), \tag{11-1} \]

where f, g, h are continuous functions of t, then as t increases so the point P
moves in space tracing out some curve. It follows that Eqns (11-1) represent
a parametric description of a curve Γ in space and, furthermore, that they
define a direction along the curve Γ corresponding to the direction in which
P moves as t increases. For example, the parametric equations

\[ x = 2 \cos 2\pi t, \qquad y = 2 \sin 2\pi t, \qquad z = 2t, \]

for 0 ≤ t ≤ 1 describe one turn of a helix, as may be seen by noticing that
the projection of the point P on the (x, y)-plane traces one revolution of the
circle x² + y² = 4 as t increases from t = 0 to t = 1, whilst the z-coordinate
of P steadily increases from z = 0 to z = 2.
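The helix just described can be tabulated directly from its parametric equations. The following minimal sketch assumes the z-coordinate is z = 2t, which is the simplest choice consistent with a steady rise from 0 to 2 over one turn; the function name is illustrative.

```python
import math


def helix(t):
    """Point (x, y, z) of the curve x = 2 cos 2*pi*t, y = 2 sin 2*pi*t, z = 2t."""
    angle = 2 * math.pi * t
    return (2 * math.cos(angle), 2 * math.sin(angle), 2 * t)


# Nine points covering one turn, 0 <= t <= 1.
samples = [helix(k / 8) for k in range(9)]

for x, y, z in samples:
    # Each point projects onto the circle x^2 + y^2 = 4 in the (x, y)-plane.
    print(round(x, 4), round(y, 4), round(z, 4), round(x * x + y * y, 4))
```

The order of the tabulated points is the direction along the curve defined by increasing t.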

If we now denote by r the position vector OP of a point P on Γ relative
to the origin O of our coordinate system, and introduce the triad of ortho-
gonal unit vectors i, j, k used in Chapter 4, it follows that (Fig. 11-1)

\[ \mathbf{r} = f(t)\mathbf{i} + g(t)\mathbf{j} + h(t)\mathbf{k}. \tag{11-2} \]

Expressions of this form are called vector functions of one real variable,
in which the dependence on the parameter t is often displayed concisely by
writing r = r(t). The name vector function arises because r is certainly a
vector and, as it depends on the real independent variable t, it must also be a
function in the sense that to each t there corresponds a vector r(t). Knowledge
of the vector function r(t) implies knowledge of the three scalar functions
f, g, and h, and conversely.

The geometrical analogy used here to interpret a general vector function
r(t) is particularly valuable in dynamics where the point P(t) with position
vector r(t) usually represents a moving particle, and the curve Γ its trajectory
in space. Under these conditions it is frequently most convenient if the
parameter t is identified with the time, though in some circumstances identi-
fication with the distance s to P measured along Γ from some fixed point on
Γ is preferable. Useful though these geometrical and dynamical analogies are,
we shall in the main use them only to help further our understanding of
general vector functions.

Fig. 11-1 Vector function of one variable interpreted as a curve in space.


The name vector function suggests, correctly, that it is possible to give
satisfactory meanings to the terms limit, continuity, and derivative when
applied to r(t). As in the ordinary calculus, the key concept is that of a limit.
Intuitively the idea of a limit is clear: when we say u(t) tends to a limit v as
t → t_0, we mean that when t is close to t_0, the vector function u(t) is in some
sense close to the vector v. In what sense though can the two vectors u(t)
and v be said to be close to one another? Ultimately, all that is necessary is
to interpret this as meaning that | u(t) − v | is small.

So, we shall say that u(t) tends to the limit v as t → t_0 if, by taking t
sufficiently close to t_0, it is possible to make | u(t) − v | arbitrarily small. As
with our previous notion of continuity we shall then say that u(t) is continuous
at t_0 if

\[ \lim_{t \to t_0} \mathbf{u}(t) = \mathbf{v} \]

and, in addition, u(t_0) = v. We incorporate these ideas into a formal
definition as follows:


DEFINITION 11-1 (vector functions: limits and continuity) Let u(t) =
u_1(t)i + u_2(t)j + u_3(t)k and v = v_1 i + v_2 j + v_3 k, then if for any ε > 0
there is some number δ such that

\[ | \mathbf{u}(t) - \mathbf{v} | < \varepsilon \quad \text{when} \quad | t - t_0 | < \delta, \]

we shall say that u(t) tends to the limit v as t → t_0, and write

\[ \lim_{t \to t_0} \mathbf{u}(t) = \mathbf{v}. \]

If in addition u(t_0) = v, then u(t) will be said to be continuous at t = t_0. A
vector function that is continuous at all points in the interval a < t < b will
be said to be continuous throughout that interval.


As usual, a vector function that is not continuous at t = t_0 will be said
to be discontinuous. It is obvious from this definition that u(t) can only
tend to the limit v as t → t_0 if the limit of each component of u(t) is equal to
the corresponding component of the vector v. Thus the limit of a vector
function of one variable is directly related to the limits of the three scalar
functions of one variable u_1(t), u_2(t), and u_3(t). This is proved by writing

\[ | \mathbf{u}(t) - \mathbf{v} | = [(u_1(t) - v_1)^2 + (u_2(t) - v_2)^2 + (u_3(t) - v_3)^2]^{1/2}, \]

showing that | u(t) − v | < ε as t → t_0 is only possible if

\[ \lim_{t \to t_0} (u_i(t) - v_i) = 0 \quad \text{for } i = 1, 2, 3, \]

or

\[ \lim_{t \to t_0} u_1(t) = v_1, \qquad \lim_{t \to t_0} u_2(t) = v_2, \qquad \lim_{t \to t_0} u_3(t) = v_3. \]

A systematic application of these arguments enables the following theorem
to be proved.


THEOREM 11-1 (continuous vector functions) If the vector functions
u(t), v(t) are defined and continuous throughout the interval a < t < b, then
the vector functions u(t) + v(t), u(t) × v(t), and the scalar function u(t) . v(t)
are also defined and continuous throughout that same interval.


Example 11-1 At what points are the vector functions u(t), v(t) discontinuous
if

\[ \mathbf{u}(t) = \sin t\,\mathbf{i} + \sec t\,\mathbf{j} + \frac{1}{t - 1}\,\mathbf{k}, \]

\[ \mathbf{v}(t) = t\,\mathbf{i} + (1 + t^2)\,\mathbf{j} + e^t\,\mathbf{k}. \]

Verify by direct calculation that u(t) + v(t), u(t) . v(t), and u(t) × v(t) are
continuous functions in any interval not containing a point of discontinuity
of u(t) or v(t).


Solution The i component of u(t) is defined and continuous for all t, whereas
the j component is discontinuous for t = (2n + 1)π/2 with n = 0, ±1, ±2,
..., and the k component is discontinuous for the single value t = 1. All
three components of v(t) are continuous for all t. We have by vector addition

\[ \mathbf{u}(t) + \mathbf{v}(t) = (t + \sin t)\,\mathbf{i} + (1 + t^2 + \sec t)\,\mathbf{j}
   + \left(e^t + \frac{1}{t - 1}\right)\mathbf{k}, \]

showing that the components of u(t) + v(t) give rise to the same points of
discontinuity as the function u(t). We may thus conclude that the vector sum
is continuous throughout any interval not containing one of these points.
For example, u(t) + v(t) is continuous in both the open interval (π/2, 3π/2)
and the closed interval [5, 7] but it is discontinuous in (0, π).

The scalar product u(t) . v(t) is given by

\[ \mathbf{u}(t) \cdot \mathbf{v}(t) = t \sin t + (1 + t^2) \sec t + \frac{e^t}{t - 1}, \]

which is, of course, a scalar. Again we see by inspection that the scalar
product is continuous in any interval not containing a point of discontinuity
of u(t).

The vector product u(t) × v(t) is

\[ \mathbf{u}(t) \times \mathbf{v}(t) =
   \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\
   \sin t & \sec t & 1/(t - 1) \\
   t & 1 + t^2 & e^t \end{vmatrix}, \]

giving

\[ \mathbf{u}(t) \times \mathbf{v}(t) = \left(e^t \sec t - \frac{1 + t^2}{t - 1}\right)\mathbf{i}
   + \left(\frac{t}{t - 1} - e^t \sin t\right)\mathbf{j}
   + [(1 + t^2) \sin t - t \sec t]\,\mathbf{k}. \]

Here also inspection of the components shows that the vector product is
continuous in any interval not containing a point of discontinuity of u(t).
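The direct calculations of Example 11-1 can be cross-checked numerically. The sketch below is only an illustration: it evaluates the scalar and vector products from the component definitions and compares them with the closed forms quoted in the solution, and it lists the points of discontinuity of u(t) that fall in a given closed interval. All function names are hypothetical.

```python
import math


def u(t):
    """u(t) = sin t i + sec t j + 1/(t - 1) k."""
    return (math.sin(t), 1 / math.cos(t), 1 / (t - 1))


def v(t):
    """v(t) = t i + (1 + t^2) j + e^t k."""
    return (t, 1 + t * t, math.exp(t))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot_closed_form(t):
    return t * math.sin(t) + (1 + t * t) / math.cos(t) + math.exp(t) / (t - 1)


def cross_closed_form(t):
    sec = 1 / math.cos(t)
    return (math.exp(t) * sec - (1 + t * t) / (t - 1),
            t / (t - 1) - math.exp(t) * math.sin(t),
            (1 + t * t) * math.sin(t) - t * sec)


def discontinuities_in(lo, hi):
    """Points t = 1 and t = (2n + 1) pi/2 lying in the closed interval [lo, hi]."""
    points = [1.0] if lo <= 1.0 <= hi else []
    n = math.ceil((lo / (math.pi / 2) - 1) / 2)
    while (2 * n + 1) * math.pi / 2 <= hi:
        points.append((2 * n + 1) * math.pi / 2)
        n += 1
    return sorted(points)


print(discontinuities_in(5.0, 7.0))   # no points of discontinuity in [5, 7]
print(discontinuities_in(0.1, 3.0))   # t = 1 and t = pi/2
print(dot(u(5.5), v(5.5)), dot_closed_form(5.5))
```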

The following definition (interpreted later) shows that, as might be ex- 
pected, the idea of a derivative can also be applied to vector functions of one 
variable. 


DEFINITION 11-2 (derivative of vector function) Let u(t) be a continuous
vector function throughout some interval a < t < b at each point of which
the limit

\[ \lim_{\Delta t \to 0} \frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t} \]

is defined. Then u(t) is said to be differentiable throughout that interval with
the derivative

\[ \frac{d\mathbf{u}}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t}. \]

The geometrical interpretation of the derivative of a vector function of a
real variable is apparent in Fig. 11-2. In that figure the curve Γ is described
by a point P(t) with position vector u(t) relative to O. The point denoted by
P′(t + Δt) is the position assumed by P at time t + Δt, so that OP = u(t),
OP′ = u(t + Δt), and PP′ = Δu is the increment in u(t) consequent upon
the increment Δt in t.

Fig. 11-2 Geometrical interpretation of du/dt.

It is obvious that as Δt → 0, so the vector Δu tends to the line of the
tangent to the curve Γ at P(t) with Δu being directed from P to P′. To inter-
pret du/dt in terms of components when u(t) = u_1(t)i + u_2(t)j + u_3(t)k, we
need only observe that

\[ \frac{d\mathbf{u}}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{u}(t + \Delta t) - \mathbf{u}(t)}{\Delta t}
   = \lim_{\Delta t \to 0} \left[\frac{u_1(t + \Delta t) - u_1(t)}{\Delta t}\right] \mathbf{i}
   + \lim_{\Delta t \to 0} \left[\frac{u_2(t + \Delta t) - u_2(t)}{\Delta t}\right] \mathbf{j}
   + \lim_{\Delta t \to 0} \left[\frac{u_3(t + \Delta t) - u_3(t)}{\Delta t}\right] \mathbf{k}, \]

from which it follows that

\[ \frac{d\mathbf{u}}{dt} = \frac{du_1}{dt}\,\mathbf{i} + \frac{du_2}{dt}\,\mathbf{j} + \frac{du_3}{dt}\,\mathbf{k}. \tag{11-3} \]

The unit vector T that is tangent to Γ at P(t) and points in the direction in
which P(t) will move with increasing t is obviously

\[ \mathbf{T} = \frac{d\mathbf{u}}{dt} \Big/ \left| \frac{d\mathbf{u}}{dt} \right|. \tag{11-4} \]

If s is the distance to P measured positively in the sense P to P′ along Γ
from some fixed point on that curve (Fig. 11-2), then we know from our
work with differentials that du_1 = u_1′ dt, du_2 = u_2′ dt, du_3 = u_3′ dt. Now as
the differentials du_1, du_2, du_3 are mutually orthogonal and represent the
increments in the coordinates [u_1(t), u_2(t), u_3(t)] of P to an adjacent point
distant ds away along Γ with coordinates [u_1(t + dt), u_2(t + dt), u_3(t + dt)],
we may apply Pythagoras' theorem to obtain

\[ (ds)^2 = (u_1' \, dt)^2 + (u_2' \, dt)^2 + (u_3' \, dt)^2, \]

whence

\[ \frac{ds}{dt} = \sqrt{\left(\frac{du_1}{dt}\right)^2 + \left(\frac{du_2}{dt}\right)^2 + \left(\frac{du_3}{dt}\right)^2}. \tag{11-5} \]

Comparison of Eqns (11-3) and (11-5) then gives the result

\[ \left| \frac{d\mathbf{u}}{dt} \right| = \frac{ds}{dt}, \tag{11-6} \]

from which we see that if t is regarded as time, then the vector function
v = du/dt is the velocity vector of P(t) as it moves with speed ds/dt along Γ
in the direction of T. These results merit recording as a theorem.


THEOREM 11-2 Let u(t) = u_1(t)i + u_2(t)j + u_3(t)k be a differentiable
vector function of the real variable t, then

\[ \frac{d\mathbf{u}}{dt} = \frac{du_1}{dt}\,\mathbf{i} + \frac{du_2}{dt}\,\mathbf{j} + \frac{du_3}{dt}\,\mathbf{k}. \]

If Γ denotes the curve traced out by the point P(t) with position vector u(t)
as t increases, and s is the distance to P(t) measured along Γ from some fixed
point, then

\[ \left| \frac{d\mathbf{u}}{dt} \right| = \frac{ds}{dt}, \]

and the unit tangent T to the curve Γ at P(t) oriented in the sense of increasing
t is

\[ \mathbf{T} = \frac{d\mathbf{u}}{dt} \Big/ \left| \frac{d\mathbf{u}}{dt} \right|. \]

As a consequence of this theorem we may write

\[ \frac{d\mathbf{u}}{dt} = \frac{ds}{dt} \left(\frac{d\mathbf{u}}{dt}\right) \Big/ \left| \frac{d\mathbf{u}}{dt} \right|
   = \frac{ds}{dt}\,\mathbf{T}, \tag{11-7} \]

which is a result of considerable use in dynamics when t is identified with
time.

Higher order derivatives such as d²u/dt² and d³u/dt³ may also be defined
in the obvious fashion as d²u/dt² = (d/dt)(du/dt), d³u/dt³ = (d/dt)(d²u/dt²)
provided only that the components of u(t) have suitable differentiability
properties. Thus, for example, if the second derivatives of the components of
u(t) exist we have

\[ \frac{d^2\mathbf{u}}{dt^2} = \frac{d^2 u_1}{dt^2}\,\mathbf{i} + \frac{d^2 u_2}{dt^2}\,\mathbf{j} + \frac{d^2 u_3}{dt^2}\,\mathbf{k}. \tag{11-8} \]

We have seen that if t is identified with time and u(t) is the position vector
of a point P, then du/dt is the velocity vector of P. It follows from this same
argument that d²u/dt² is the acceleration vector of P.


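Example 11-2 The position vector r of a particle at time t is given by

\[ \mathbf{r} = a \cos \omega t\,\mathbf{i} + a \sin \omega t\,\mathbf{j} + \alpha t^2\,\mathbf{k}, \]

where i, j, k have their usual meanings and a, ω, and α are constants. Find
the acceleration vector at time t, and deduce the times at which it will be
perpendicular to the position vector. Hence deduce the unit tangent to the
particle trajectory at these times.


Solution By making the identifications u = r, u_1(t) = a cos ωt, u_2(t) =
a sin ωt and u_3(t) = αt² and then applying Theorem 11-2 we find that the
velocity vector is

\[ \frac{d\mathbf{r}}{dt} = -a\omega \sin \omega t\,\mathbf{i} + a\omega \cos \omega t\,\mathbf{j} + 2\alpha t\,\mathbf{k}. \]

A further differentiation yields the required acceleration vector

\[ \frac{d^2\mathbf{r}}{dt^2} = -a\omega^2 \cos \omega t\,\mathbf{i} - a\omega^2 \sin \omega t\,\mathbf{j} + 2\alpha\,\mathbf{k}. \]

Expressed vectorially, the condition that r and d²r/dt² should be perpendicular
is simply that r . (d²r/dt²) = 0. Hence to find the time at which this condition
is satisfied we must solve the equation

\[ (a \cos \omega t\,\mathbf{i} + a \sin \omega t\,\mathbf{j} + \alpha t^2\,\mathbf{k}) \cdot
   (-a\omega^2 \cos \omega t\,\mathbf{i} - a\omega^2 \sin \omega t\,\mathbf{j} + 2\alpha\,\mathbf{k}) = 0. \]

Forming the required scalar product gives

\[ -a^2\omega^2 \cos^2 \omega t - a^2\omega^2 \sin^2 \omega t + 2\alpha^2 t^2 = 0 \]

which immediately simplifies to

\[ a^2\omega^2 = 2\alpha^2 t^2, \]

showing that the desired times are

\[ t = \pm \frac{a\omega}{\alpha\sqrt{2}}. \]

To deduce the unit tangent T at these times we use the fact that

\[ \mathbf{T} = \left(\frac{d\mathbf{r}}{dt}\right) \Big/ \left| \frac{d\mathbf{r}}{dt} \right|, \]

where here

\[ \left| \frac{d\mathbf{r}}{dt} \right| = \sqrt{a^2\omega^2 + 4\alpha^2 t^2}. \]

Denoting by T± the unit tangents to the trajectory at t = ±aω/(α√2), we find
by substitution of these values of t in the above expression that, for aω > 0,

\[ \mathbf{T}_+ = \frac{1}{\sqrt{3}} \left(-\sin \frac{a\omega^2}{\alpha\sqrt{2}}\,\mathbf{i}
   + \cos \frac{a\omega^2}{\alpha\sqrt{2}}\,\mathbf{j} + \sqrt{2}\,\mathbf{k}\right) \]

and

\[ \mathbf{T}_- = \frac{1}{\sqrt{3}} \left(\sin \frac{a\omega^2}{\alpha\sqrt{2}}\,\mathbf{i}
   + \cos \frac{a\omega^2}{\alpha\sqrt{2}}\,\mathbf{j} - \sqrt{2}\,\mathbf{k}\right). \]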

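With the obvious differentiability requirements, if u(t) and v(t) are differ-
entiable vector functions with respect to t, then so also are u + v, u . v,
u × v, and φu, where φ = φ(t) is a scalar function of t. As the following
theorem is easily proved by resolution of the vector functions involved into
component form, it is stated without proof.


THEOREM 11-3 (differentiation, sums and products of vector functions) If
u(t) and v(t) are differentiable vector functions throughout some interval
a < t < b and φ(t) is a differentiable scalar function throughout that same
interval, then

\[ \text{(a)} \quad \frac{d}{dt}(\mathbf{u} + \mathbf{v}) = \frac{d\mathbf{u}}{dt} + \frac{d\mathbf{v}}{dt}; \]

\[ \text{(b)} \quad \frac{d}{dt}(\phi\mathbf{u}) = \frac{d\phi}{dt}\,\mathbf{u} + \phi\,\frac{d\mathbf{u}}{dt}; \]

\[ \text{(c)} \quad \frac{d}{dt}(\mathbf{u} \cdot \mathbf{v}) = \frac{d\mathbf{u}}{dt} \cdot \mathbf{v} + \mathbf{u} \cdot \frac{d\mathbf{v}}{dt}; \]

\[ \text{(d)} \quad \frac{d}{dt}(\mathbf{u} \times \mathbf{v}) = \frac{d\mathbf{u}}{dt} \times \mathbf{v} + \mathbf{u} \times \frac{d\mathbf{v}}{dt}; \]

and, if c is a constant vector,

\[ \text{(e)} \quad \frac{d\mathbf{c}}{dt} = \mathbf{0}; \]

where the order of the vector products on the right-hand side of (d) must be
strictly observed.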
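Rules (c) and (d) of Theorem 11-3 can be illustrated numerically by comparing a central-difference estimate of the left-hand side with the right-hand side evaluated from known derivatives. The sketch below is not a proof; the two vector functions u(t) and v(t) are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
import math


def u(t):
    return (math.sin(t), t * t, math.exp(-t))


def du(t):
    return (math.cos(t), 2 * t, -math.exp(-t))


def v(t):
    return (t, math.cos(2 * t), 1 + t ** 3)


def dv(t):
    return (1.0, -2 * math.sin(2 * t), 3 * t * t)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(p + q for p, q in zip(a, b))


def derivative(f, t, h=1e-5):
    """Central-difference estimate of df/dt for a scalar- or tuple-valued f."""
    ahead, behind = f(t + h), f(t - h)
    if isinstance(ahead, tuple):
        return tuple((p - q) / (2 * h) for p, q in zip(ahead, behind))
    return (ahead - behind) / (2 * h)


t0 = 0.8

# Rule (c): derivative of the scalar product.
lhs_c = derivative(lambda t: dot(u(t), v(t)), t0)
rhs_c = dot(du(t0), v(t0)) + dot(u(t0), dv(t0))

# Rule (d): derivative of the vector product, with the order of factors kept.
lhs_d = derivative(lambda t: cross(u(t), v(t)), t0)
rhs_d = add(cross(du(t0), v(t0)), cross(u(t0), dv(t0)))

print(lhs_c, rhs_c)
print(lhs_d, rhs_d)
```

Reversing the order of the factors in either vector product on the right-hand side of (d) changes its sign, which is why the order must be observed.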


When considering the geometry of twisted curves in space it is convenient 
to identify points on a curve I’ by specifying their distance s measured along 
the curve itself from some fixed point. This is of course equivalent to identi- 
fying ¢ with s in the position vector τί) so that I’ is then defined as the locus 
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of the points having the equation r = r(s). This equation is called the intrinsic 
equation of the curve I’. In terms of the intrinsic equation it follows from 
Eqn (11-7) that the unit tangent T to the curve I’ at r = r(s) is 


dr 
T =—- 
ds 
Now although T is a vector function of s, it is also a unit vector, and so 


ΤΟΥ = |. Differentiating this scalar product with respect to s by means of 
Theorem 11-3 (c) then gives 


(11-9) 


dT dT 
— -T+T-— Ξὸ 
ds ε ds 
or, as vectors in a scalar product commute, 
dT 
-— =0, 
ds 


Hence, provided dT/ds ≠ 0, the derivative of the unit tangent T with respect
to s is normal to T. Next, denoting by N the unit vector along dT/ds,
we define the essentially positive scalar function κ = κ(s) by means of the
equation

$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N}. \qquad (11\text{-}10)$$

Here κ is called the curvature of the curve at the point in question, and on
account of the relationship between T and N, the vector N is called the
principal unit normal to the curve Γ at that point. As κ is positive by
definition and N is a unit vector it follows from Eqn (11-10) that

$$\kappa = \left|\frac{d\mathbf{T}}{ds}\right|. \qquad (11\text{-}11)$$


It is convenient to define a third and mutually orthogonal unit vector B
called the unit binormal by means of the equation

$$\mathbf{B} = \mathbf{T}\times\mathbf{N}. \qquad (11\text{-}12)$$

The three unit vectors B, T, and N are, in general, all functions of s and they
serve as a specially useful triad of mutually orthogonal unit reference vectors
at points on the curve Γ. It is important to appreciate that in general B, T,
N, and κ vary from point to point on the curve Γ, being always defined in
relation to the local properties of the curve in question. The positive number
ρ = 1/κ defined at each point of the curve Γ is called the radius of curvature
of the curve at that point.


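Example 11-3 Find B, T, N, and the scalars κ, ρ for the curve defined
parametrically in terms of t by the expression

$$\mathbf{r} = 2\cos(t+\mu)\,\mathbf{i} - 2\sin(t+\mu)\,\mathbf{j} + 4t\,\mathbf{k},$$

where μ is a constant. Hence deduce the values of these quantities at the point
on the curve corresponding to t = 0.


Solution First notice that t is not the arc length s along the curve, because
were this the case then it would follow that ds/dt = 1, whereas from Eqn
(11-5) we have

$$\frac{ds}{dt} = \sqrt{4\cos^2(t+\mu) + 4\sin^2(t+\mu) + 16} = 2\sqrt{5}.$$

Now, using Eqn (11-9) we have

$$\mathbf{T} = \frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}}{dt}\,\frac{dt}{ds} = \left(\frac{d\mathbf{r}}{dt}\right)\Big/\left(\frac{ds}{dt}\right),$$

whence

$$\mathbf{T} = \frac{1}{2\sqrt{5}}\,\frac{d\mathbf{r}}{dt}.$$

Thus

$$\mathbf{T} = \frac{1}{2\sqrt{5}}\,\frac{d}{dt}\bigl(2\cos(t+\mu)\,\mathbf{i} - 2\sin(t+\mu)\,\mathbf{j} + 4t\,\mathbf{k}\bigr)
= \frac{1}{2\sqrt{5}}\bigl(-2\sin(t+\mu)\,\mathbf{i} - 2\cos(t+\mu)\,\mathbf{j} + 4\mathbf{k}\bigr),$$

and so

$$\mathbf{T} = -\frac{1}{\sqrt{5}}\bigl(\sin(t+\mu)\,\mathbf{i} + \cos(t+\mu)\,\mathbf{j} - 2\mathbf{k}\bigr).$$

Next, to find N and κ we write Eqn (11-10) as

$$\kappa\mathbf{N} = \frac{d\mathbf{T}}{ds} = \frac{d\mathbf{T}}{dt}\,\frac{dt}{ds} = \left(\frac{d\mathbf{T}}{dt}\right)\Big/\left(\frac{ds}{dt}\right).$$

Hence

$$\kappa\mathbf{N} = \frac{1}{2\sqrt{5}}\,\frac{d\mathbf{T}}{dt}
= \frac{1}{2\sqrt{5}}\cdot\frac{1}{\sqrt{5}}\bigl(-\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}\bigr)
= \frac{1}{10}\bigl(-\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}\bigr).$$

Using κ = | dT/ds |, it then follows that

$$\kappa = \left|\frac{-\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}}{10}\right| = \frac{1}{10}$$

and, consequently, that

$$\mathbf{N} = -\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}.$$

Since the radius of curvature ρ is defined by the relationship ρ = 1/κ, we have
ρ = 10.

Finally, using the definition B = T x N gives

$$\mathbf{B} = -\frac{1}{\sqrt{5}}\bigl(\sin(t+\mu)\,\mathbf{i} + \cos(t+\mu)\,\mathbf{j} - 2\mathbf{k}\bigr)\times\bigl(-\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}\bigr),$$

whence

$$\mathbf{B} = -\frac{1}{\sqrt{5}}\bigl(2\sin(t+\mu)\,\mathbf{i} + 2\cos(t+\mu)\,\mathbf{j} + \mathbf{k}\bigr).$$

The point on the curve corresponding to t = 0 is r(0) = 2 cos μ i − 2 sin μ j,
and at this point:

$$\mathbf{T}(0) = -\frac{1}{\sqrt{5}}(\sin\mu\,\mathbf{i} + \cos\mu\,\mathbf{j} - 2\mathbf{k}),$$

$$\mathbf{N}(0) = -\cos\mu\,\mathbf{i} + \sin\mu\,\mathbf{j},$$

$$\mathbf{B}(0) = -\frac{1}{\sqrt{5}}(2\sin\mu\,\mathbf{i} + 2\cos\mu\,\mathbf{j} + \mathbf{k}).$$

The curvature κ = 1/10 is independent of t, and so κ is the same for all
points on the curve, as is the radius of curvature ρ = 1/κ = 10.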
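
The results of Example 11-3 can be checked with a short Python sketch. It
does not follow the route taken in the solution: instead it assumes the
standard identities κ = |r' x r''| / |r'|³ and B parallel to r' x r'', where
primes denote differentiation with respect to t, and then obtains N from
N = B x T. The function name frenet_helix is introduced here purely for
illustration.

```python
import math


def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(sum(x * x for x in a))


def scale(c, a):
    return tuple(c * x for x in a)


def frenet_helix(t, mu=0.0):
    """T, N, B, kappa, rho for r = 2cos(t+mu) i - 2sin(t+mu) j + 4t k."""
    s, c = math.sin(t + mu), math.cos(t + mu)
    r1 = (-2 * s, -2 * c, 4.0)    # dr/dt
    r2 = (-2 * c, 2 * s, 0.0)     # d2r/dt2
    speed = norm(r1)              # ds/dt, equal to 2*sqrt(5) here
    w = cross(r1, r2)
    T = scale(1 / speed, r1)
    B = scale(1 / norm(w), w)
    N = cross(B, T)               # last relation of the right-handed triad
    kappa = norm(w) / speed ** 3
    return T, N, B, kappa, 1 / kappa


T, N, B, kappa, rho = frenet_helix(0.0, mu=0.3)
print(kappa, rho)
```

Because only analytic derivatives of r(t) are used, the computed curvature
should agree with 1/10 to rounding error for every t and μ.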


Thus far we have defined the triad of unit vectors B, T, and N which serve
as a moving set of reference vectors along the curve Γ. We have also
calculated the derivative dT/ds, and to complete our examination of these
vectors it only remains for us to find dB/ds and dN/ds. For our starting point
we take Eqn (11-12), which we differentiate with respect to s, using Theorem
11-3 (d), to obtain

$$\frac{d\mathbf{B}}{ds} = \frac{d\mathbf{T}}{ds}\times\mathbf{N} + \mathbf{T}\times\frac{d\mathbf{N}}{ds}$$

which, on account of Eqn (11-10) and the fact that N x N = 0, reduces to

$$\frac{d\mathbf{B}}{ds} = \mathbf{T}\times\frac{d\mathbf{N}}{ds}.$$


Next, forming the vector product of this equation with N and expanding the
resulting triple vector product on the right-hand side gives

$$\mathbf{N}\times\frac{d\mathbf{B}}{ds} = \mathbf{N}\times\left(\mathbf{T}\times\frac{d\mathbf{N}}{ds}\right)
= \left(\mathbf{N}\cdot\frac{d\mathbf{N}}{ds}\right)\mathbf{T} - (\mathbf{N}\cdot\mathbf{T})\,\frac{d\mathbf{N}}{ds}.$$

However as N is a unit vector it follows, as in the derivation of Eqn (11-10),
that N . (dN/ds) = 0, whilst the orthogonality of N and T implies that
N . T = 0. Thus,

$$\mathbf{N}\times\frac{d\mathbf{B}}{ds} = \mathbf{0},$$
and hence the vectors N and dB/ds must be parallel, differing only by a scalar
factor. This scalar factor is in general a function of s. It is conventionally
written −τ, where τ is called the torsion of the curve Γ, so we can write

$$\frac{d\mathbf{B}}{ds} = -\tau\mathbf{N}. \qquad (11\text{-}13)$$

If required, the torsion τ may be calculated by using the obvious result

$$\tau = -\mathbf{N}\cdot\frac{d\mathbf{B}}{ds}. \qquad (11\text{-}14)$$

See Problems 11-16 to 11-18 for an alternative treatment of the calculation
of ρ and τ.

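The manner of construction of B, T, and N is such that they form a
right-handed set in this order and, consequently,

$$\mathbf{B} = \mathbf{T}\times\mathbf{N}, \qquad \mathbf{T} = \mathbf{N}\times\mathbf{B}, \qquad \mathbf{N} = \mathbf{B}\times\mathbf{T}. \qquad (11\text{-}15)$$

This relationship is indicated in Fig. 11-3 for a point P on the curve Γ.

To find dN/ds we differentiate the last result of Eqn (11-15) with respect
to s, and use Eqns (11-10), (11-13) together with the other results of Eqn
(11-15) to obtain,

$$\frac{d\mathbf{N}}{ds} = \frac{d\mathbf{B}}{ds}\times\mathbf{T} + \mathbf{B}\times\frac{d\mathbf{T}}{ds}
= -\tau\,\mathbf{N}\times\mathbf{T} + \kappa\,\mathbf{B}\times\mathbf{N},$$

whence

$$\frac{d\mathbf{N}}{ds} = \tau\mathbf{B} - \kappa\mathbf{T}. \qquad (11\text{-}16)$$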


The study of the geometrical properties of space curves using the calculus
techniques is called the differential geometry of curves, and it has as its basis
the three equations

$$\frac{d\mathbf{T}}{ds} = \kappa\mathbf{N}, \qquad \frac{d\mathbf{B}}{ds} = -\tau\mathbf{N}, \qquad \frac{d\mathbf{N}}{ds} = \tau\mathbf{B} - \kappa\mathbf{T}, \qquad (11\text{-}17)$$

which are called the Serret-Frenet equations. Naturally, similar ideas lead
to the differential geometry of surfaces, though we shall make no further
use of such ideas in this first account of the subject.

Fig. 11-3 Moving triad of reference vectors.


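Example 11-4 Find the torsion of the circular helix of Example 11-3.


Solution In the previous example it was shown that dt/ds = 1/(2√5) and

$$\mathbf{N} = -\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j},$$

$$\mathbf{B} = -\frac{1}{\sqrt{5}}\bigl(2\sin(t+\mu)\,\mathbf{i} + 2\cos(t+\mu)\,\mathbf{j} + \mathbf{k}\bigr).$$

Hence,

$$\frac{d\mathbf{B}}{ds} = \frac{d\mathbf{B}}{dt}\,\frac{dt}{ds} = -\frac{1}{5}\bigl(\cos(t+\mu)\,\mathbf{i} - \sin(t+\mu)\,\mathbf{j}\bigr).$$

An application of Eqn (11-14) gives

$$\tau = -\mathbf{N}\cdot\frac{d\mathbf{B}}{ds}
= \frac{1}{5}\bigl[-\cos(t+\mu)\,\mathbf{i} + \sin(t+\mu)\,\mathbf{j}\bigr]\cdot\bigl[\cos(t+\mu)\,\mathbf{i} - \sin(t+\mu)\,\mathbf{j}\bigr] = -\frac{1}{5}.$$

That τ is constant might have been anticipated, for the circular helix in
question is similar to a screw thread with a constant pitch, and consequently
its curvature and twist properties must be the same at all points.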
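
The torsion of this helix can also be evaluated directly from the
parametric form. The Python sketch below assumes the standard identity
τ = (r' x r'') . r''' / |r' x r''|², which is consistent with the sign
convention dB/ds = −τN of Eqn (11-13) but is not derived in this section.
Primes denote differentiation with respect to t, and the function name
helix_torsion is hypothetical.

```python
import math


def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def helix_torsion(t, mu=0.0):
    """Torsion of r = 2cos(t+mu) i - 2sin(t+mu) j + 4t k."""
    s, c = math.sin(t + mu), math.cos(t + mu)
    r1 = (-2 * s, -2 * c, 4.0)   # first derivative of r
    r2 = (-2 * c, 2 * s, 0.0)    # second derivative of r
    r3 = (2 * s, 2 * c, 0.0)     # third derivative of r
    w = cross(r1, r2)
    return dot(w, r3) / dot(w, w)


for t in (0.0, 1.0, 2.5):
    print(t, helix_torsion(t, mu=0.4))
```

With this sign convention the value obtained is −1/5 at every point, the
negative sign corresponding to the sense in which this particular helix
twists.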


11:2 Antiderivatives and integrals of vector functions 


The notion of an antiderivative, already encountered in Chapter 8, extends
naturally to a vector function of a real variable.


DEFINITION 11-3 (antiderivative of vector function) The vector function
F(t) of the real variable t will be said to be the antiderivative of the vector
function f(t) if

$$\frac{d\mathbf{F}}{dt} = \mathbf{f}(t).$$

Naturally, an antiderivative F(t) is indeterminate so far as an additive
arbitrary constant vector C is concerned, because by Theorem 11-3 (e),
dC/dt = 0. Continuing the convention adopted in Chapter 8, the operation
of antidifferentiation with respect to a vector function of the single real
variable t will be denoted by ∫, so that

$$\int \mathbf{f}(t)\,dt = \mathbf{F}(t) + \mathbf{C}, \qquad (11\text{-}18)$$

where C is an arbitrary constant vector.
It is obvious that Eqn (11-18), when taken in conjunction with Theorem
11-2, implies the following result.


THEOREM 11-4 (antiderivative of vector function) If

$$\int \mathbf{f}(t)\,dt = \mathbf{F}(t) + \mathbf{C},$$

where f(t) = f1(t)i + f2(t)j + f3(t)k, F(t) = F1(t)i + F2(t)j + F3(t)k and
C = C1i + C2j + C3k is an arbitrary constant vector, then

$$\int f_i(t)\,dt = F_i(t) + C_i, \qquad \text{with } i = 1, 2, 3.$$

Expressed in words, the antiderivative of f(t) has components equal to the
antiderivatives of the components of f(t). As with the scalar case, in many
books the entire right-hand side of Eqn (11-18) is loosely referred to as the
indefinite integral of the vector function f(t), rather than as here using this
term to refer only to its first member.


Example 11-5 Find the antiderivative of f(t) given that

$$\mathbf{f}(t) = \cos t\,\mathbf{i} + (1 + t^2)\,\mathbf{j} + e^{-t}\,\mathbf{k}.$$

Solution It follows immediately from Theorem 11-4 that,

$$\int \mathbf{f}(t)\,dt = \mathbf{i}\int \cos t\,dt + \mathbf{j}\int (1 + t^2)\,dt + \mathbf{k}\int e^{-t}\,dt
= \sin t\,\mathbf{i} + \left(t + \frac{t^3}{3}\right)\mathbf{j} - e^{-t}\,\mathbf{k} + \mathbf{C}.$$

The obvious modification to Theorem 11-4 to enable us to work with
definite integrals of vector functions of a single real variable comprises the
next theorem. Because it is strictly analogous to the scalar case it is offered
without proof.


THEOREM 11-5 (definite integral of vector function) If F(t) is an
antiderivative of f(t), then

$$\int_a^b \mathbf{f}(t)\,dt = \mathbf{F}(b) - \mathbf{F}(a).$$


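Example 11-6 Evaluate the definite integral

$$\int_0^{\pi/4} (t^2\,\mathbf{i} + \sec^2 t\,\mathbf{j} + \mathbf{k})\,dt.$$


Solution From Theorem 11-5 we have the result

$$\int_0^{\pi/4} (t^2\,\mathbf{i} + \sec^2 t\,\mathbf{j} + \mathbf{k})\,dt
= \left[\frac{t^3}{3}\,\mathbf{i} + \tan t\,\mathbf{j} + t\,\mathbf{k}\right]_0^{\pi/4}
= \frac{\pi^3}{192}\,\mathbf{i} + \mathbf{j} + \frac{\pi}{4}\,\mathbf{k}.$$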


A slightly more interesting application of a definite integral is provided 
by the following example concerning the motion of a particle in space. 


Example 11-7 A point moving in space has acceleration

$$\sin 2t\,\mathbf{i} - \cos 2t\,\mathbf{k}.$$

Find the equation of its path if it passes through the point with position
vector r0 = j + 2k with velocity 2j at time t = 0.


Solution If r is the general position vector of the point at time t, then the
velocity v(t) = dr/dt and the acceleration a(t) = d²r/dt². Hence

$$\frac{d^2\mathbf{r}}{dt^2} = \sin 2t\,\mathbf{i} - \cos 2t\,\mathbf{k},$$

so that integrating the acceleration equation from 0 to t and replacing t in the
integrand by the dummy variable τ gives

$$\int_0^t \left(\frac{d^2\mathbf{r}}{d\tau^2}\right)d\tau = \int_0^t (\sin 2\tau\,\mathbf{i} - \cos 2\tau\,\mathbf{k})\,d\tau.$$

Hence

$$\left[\frac{d\mathbf{r}}{d\tau}\right]_0^t = \Bigl[-\tfrac{1}{2}(\cos 2\tau\,\mathbf{i} + \sin 2\tau\,\mathbf{k})\Bigr]_0^t,$$

and so

$$\mathbf{v}(t) = \mathbf{v}_0 + \tfrac{1}{2}(1 - \cos 2t)\,\mathbf{i} - \tfrac{1}{2}\sin 2t\,\mathbf{k}.$$

Now from the initial conditions of the problem v0 = 2j, so that the velocity
equation becomes

$$\mathbf{v}(t) = \tfrac{1}{2}(1 - \cos 2t)\,\mathbf{i} + 2\mathbf{j} - \tfrac{1}{2}\sin 2t\,\mathbf{k}.$$


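To find the equation of the path a further integration is required so, setting
v(t) = dr/dt, integrating the velocity equation from 0 to t gives

$$\int_0^t \left(\frac{d\mathbf{r}}{d\tau}\right)d\tau = \int_0^t \bigl(\tfrac{1}{2}(1 - \cos 2\tau)\,\mathbf{i} + 2\mathbf{j} - \tfrac{1}{2}\sin 2\tau\,\mathbf{k}\bigr)\,d\tau.$$

Hence

$$\Bigl[\mathbf{r}(\tau)\Bigr]_0^t = \Bigl[\tfrac{1}{2}\bigl(\tau - \tfrac{1}{2}\sin 2\tau\bigr)\mathbf{i} + 2\tau\,\mathbf{j} + \tfrac{1}{4}\cos 2\tau\,\mathbf{k}\Bigr]_0^t,$$

and so

$$\mathbf{r}(t) = \mathbf{r}_0 + \tfrac{1}{2}\bigl(t - \tfrac{1}{2}\sin 2t\bigr)\mathbf{i} + 2t\,\mathbf{j} + \tfrac{1}{4}(\cos 2t - 1)\,\mathbf{k}.$$

Again appealing to the initial conditions of the problem we find that r0 =
j + 2k, so that, finally, the particle path must be

$$\mathbf{r}(t) = \tfrac{1}{2}\bigl(t - \tfrac{1}{2}\sin 2t\bigr)\mathbf{i} + (1 + 2t)\,\mathbf{j} + \tfrac{1}{4}(7 + \cos 2t)\,\mathbf{k}.$$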
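
As an independent check on Example 11-7, the equations dr/dt = v and
dv/dt = a(t) may be integrated numerically from the stated initial
conditions and compared with the closed-form path. The sketch below uses the
classical fourth-order Runge-Kutta method, which is not a method used in the
text, and the function names are hypothetical.

```python
import math


def accel(t):
    """Acceleration sin 2t i - cos 2t k as an (i, j, k) tuple."""
    return (math.sin(2 * t), 0.0, -math.cos(2 * t))


def exact_path(t):
    """Closed-form path obtained in the worked solution."""
    return (0.5 * (t - 0.5 * math.sin(2 * t)),
            1.0 + 2.0 * t,
            0.25 * (7.0 + math.cos(2 * t)))


def integrate_path(t_end, steps=2000):
    """RK4 for dr/dt = v, dv/dt = a(t) with r(0) = j + 2k and v(0) = 2j."""
    h = t_end / steps
    r = [0.0, 1.0, 2.0]
    v = [0.0, 2.0, 0.0]
    t = 0.0
    for _ in range(steps):
        a1, a2, a4 = accel(t), accel(t + h / 2), accel(t + h)
        for i in range(3):
            # The acceleration depends on t only, so the two midpoint
            # slopes for v coincide.
            k1v, k2v, k3v, k4v = a1[i], a2[i], a2[i], a4[i]
            k1r = v[i]
            k2r = v[i] + h / 2 * k1v
            k3r = v[i] + h / 2 * k2v
            k4r = v[i] + h * k3v
            r[i] += h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
            v[i] += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return tuple(r)


numeric = integrate_path(3.0)
print(max(abs(a - b) for a, b in zip(numeric, exact_path(3.0))))
```

The printed discrepancy should be very small, which supports both the
integration constants and the final form of r(t).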


The form of definite integral of a vector function so far considered is
itself a vector. We now discuss one final generalization of the notion of a
definite integral involving a vector function that generates a scalar.

Let a curve Γ defined parametrically in terms of the arc length s have the
general position vector r = r(s) and unit tangent vector T(s), and let F(s)
be a vector function of s. Then at any point of Γ the scalar function φ(s) =
F(s) . T(s) represents the component of F(s) tangential to Γ. If the scalar
function φ(s) is then integrated from s = a to s = b, this is obviously
equivalent to integrating the tangential component of F(s) along Γ from the
point r = r(a) to the point r = r(b). An integral of this form is therefore
called either a line integral or a curvilinear integral of the vector function
F(s) taken along the curve Γ, which is sometimes referred to as the path of
integration.


DEFINITION 11-4 (line integral of vector function) The line integral of the
vector function F(s) taken along the curve Γ between the points A and B
with position vectors r = r(a) and r = r(b), respectively, is the quantity

$$J = \int_a^b \phi(s)\,ds = \int_a^b \mathbf{F}\cdot\mathbf{T}\,ds,$$

where φ(s) = F(s) . T(s), s denotes arc length along Γ, and T(s) is the unit
tangent vector to Γ.

In terms of the general position vector r of a point on the curve and the
fact that s is the arc length along Γ, we obviously have the relationship
dr = T ds, so that the line integral may also be written

$$J = \int_A^B \mathbf{F}\cdot d\mathbf{r}$$

or, more simply still if Γ denotes part of a curve, as

$$J = \int_\Gamma \mathbf{F}\cdot d\mathbf{r}.$$

In component form, setting the differential dr = dx i + dy j + dz k and
F = F1i + F2j + F3k, we have at once

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_\Gamma F_1\,dx + F_2\,dy + F_3\,dz. \qquad (11\text{-}19)$$

If desired, the line integral (11-19) may be defined vectorially in terms of
the limit of a sum in a manner strictly analogous to the definition of an
ordinary definite integral. To achieve this, let the interval a ≤ s ≤ b be
divided into n sub-intervals $s_{i-1} \le s \le s_i$ with i = 1, 2, ..., n, where
$s_0 = a$ and $s_n = b$. Then setting $d\mathbf{r}_i = \mathbf{r}(s_i) - \mathbf{r}(s_{i-1})$ as in
Fig. 11-4, the line integral (11-19) may be approximated by the sum

$$J_n = \sum_{i=1}^{n} \mathbf{F}(s_i)\cdot d\mathbf{r}_i. \qquad (11\text{-}20)$$

Fig. 11-4 Line integral of F along Γ.

If the number of sub-divisions n is now allowed to tend to infinity in such a
manner that the lengths of all the sub-divisions tend to zero then, as with an
ordinary definite integral, we arrive at the result

$$\int_A^B \mathbf{F}\cdot d\mathbf{r} = \lim_{n\to\infty} \sum_{i=1}^{n} \mathbf{F}(s_i)\cdot d\mathbf{r}_i. \qquad (11\text{-}21)$$

When used in this context, the differential $d\mathbf{r}_i$ is usually called a line
element of the curve Γ joining A to B.


Example 11-8 Evaluate the line integral

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r}$$

given that F = yz i + 2xz j + xy k and Γ is that part of the circular helix
x = a cos t, y = a sin t, z = kt that corresponds to the interval 0 ≤ t ≤ 2π.


Solution First we use Eqn (11-19) to write the line integral as

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_\Gamma yz\,dx + 2xz\,dy + xy\,dz.$$

Now along the path Γ we have the relationships

x = a cos t, y = a sin t, z = kt

which imply the differential relationships

dx = −a sin t dt, dy = a cos t dt, dz = k dt.

Hence

$$\int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_0^{2\pi} \bigl(-a^2kt\sin^2 t + 2a^2kt\cos^2 t + a^2k\sin t\cos t\bigr)\,dt$$

$$= -a^2k\left[\frac{t^2}{4} - \frac{t\sin 2t}{4} - \frac{\cos 2t}{8}\right]_0^{2\pi}
+ 2a^2k\left[\frac{t^2}{4} + \frac{t\sin 2t}{4} + \frac{\cos 2t}{8}\right]_0^{2\pi}
- a^2k\left[\frac{\cos 2t}{4}\right]_0^{2\pi}$$

$$= a^2\pi^2 k.$$
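
The value a²π²k can be compared with a direct numerical approximation of the
sum in Eqn (11-20). The Python sketch below takes F = yz i + 2xz j + xy k,
the form used in the worked solution. As a labelled departure from Eqn
(11-20), F is evaluated at the midpoint of each parameter sub-interval rather
than at its end point, which is assumed here only to speed up convergence.
The function names are hypothetical.

```python
import math


def field(x, y, z):
    """F = yz i + 2xz j + xy k, as used in the worked solution."""
    return (y * z, 2 * x * z, x * y)


def helix(t, a, k):
    """Point on the circular helix x = a cos t, y = a sin t, z = kt."""
    return (a * math.cos(t), a * math.sin(t), k * t)


def line_integral_sum(a, k, n=20000):
    """Sum of F . (increment of r) over n sub-intervals of 0 <= t <= 2*pi."""
    total = 0.0
    dt = 2 * math.pi / n
    prev = helix(0.0, a, k)
    for i in range(1, n + 1):
        cur = helix(i * dt, a, k)
        mid = helix((i - 0.5) * dt, a, k)   # midpoint evaluation of F
        F = field(*mid)
        total += sum(f * (c - p) for f, c, p in zip(F, cur, prev))
        prev = cur
    return total


a, k = 1.5, 0.5
print(line_integral_sum(a, k), a * a * math.pi ** 2 * k)
```

Increasing n should bring the two printed numbers closer together, in
keeping with the limiting process of Eqn (11-21).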


11:3 Some applications 


Kinematics, an important branch of mechanics, is essentially concerned with
the geometrical aspect of the motion of particles along curves. Of particular
importance is that class of motions that occur entirely in one plane, and so
are called planar motions. In many of these situations, for example, particle
motion in an orbit, the position of a particle is best defined in terms of the
polar coordinates (r, θ) in the plane of the motion. Let us then determine
expressions for the velocity and acceleration of a particle in terms of polar
coordinates.

Fig. 11-5 Planar motion of particle in terms of polar coordinates.

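We first appeal to Fig. 11-5, which represents a particle P moving in the
indicated direction along the curve Γ. The unit vectors R, Θ are normal to
each other and are such that R is directed from O to P along the radius
vector OP, and Θ points in the direction of increasing θ. Then clearly R and
Θ are vector functions of the single variable θ, with

$$\mathbf{R} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} \quad\text{and}\quad \boldsymbol{\Theta} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}. \qquad (11\text{-}22)$$

It follows from these relationships that

$$\frac{d\mathbf{R}}{d\theta} = \boldsymbol{\Theta} \quad\text{and}\quad \frac{d\boldsymbol{\Theta}}{d\theta} = -\mathbf{R}. \qquad (11\text{-}23)$$

In terms of the unit vectors R, Θ the point P has the position vector

$$\mathbf{r} = r\mathbf{R}, \qquad (11\text{-}24)$$

so that the velocity dr/dt must be

$$\frac{d\mathbf{r}}{dt} = \frac{dr}{dt}\,\mathbf{R} + r\,\frac{d\mathbf{R}}{dt}
= \frac{dr}{dt}\,\mathbf{R} + r\,\frac{d\mathbf{R}}{d\theta}\,\frac{d\theta}{dt},$$

showing that the velocity vector of P is

$$\dot{\mathbf{r}} = \dot{r}\,\mathbf{R} + r\dot{\theta}\,\boldsymbol{\Theta}, \qquad (11\text{-}25)$$

where differentiation with respect to time has been denoted by a dot.

Here the quantity $\dot{r}$ is called the radial component of velocity and $r\dot{\theta}$ is
called the transverse component of velocity. A further differentiation with
respect to time yields for the acceleration vector f = d²r/dt² the expression

$$\ddot{\mathbf{r}} = \ddot{r}\,\mathbf{R} + \dot{r}\,\dot{\mathbf{R}} + \dot{r}\dot{\theta}\,\boldsymbol{\Theta} + r\ddot{\theta}\,\boldsymbol{\Theta} + r\dot{\theta}\,\dot{\boldsymbol{\Theta}}$$

or

$$\ddot{\mathbf{r}} = \ddot{r}\,\mathbf{R} + \dot{r}\dot{\theta}\,\frac{d\mathbf{R}}{d\theta} + (\dot{r}\dot{\theta} + r\ddot{\theta})\,\boldsymbol{\Theta} + r\dot{\theta}^2\,\frac{d\boldsymbol{\Theta}}{d\theta}.$$

Hence by Eqn (11-23) this is seen to be equivalent to

$$\ddot{\mathbf{r}} = (\ddot{r} - r\dot{\theta}^2)\,\mathbf{R} + (2\dot{r}\dot{\theta} + r\ddot{\theta})\,\boldsymbol{\Theta}. \qquad (11\text{-}26)$$

The quantity $\ddot{r} - r\dot{\theta}^2$ is called the radial component of acceleration, and
$2\dot{r}\dot{\theta} + r\ddot{\theta}$ is called the transverse component of acceleration.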
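
Eqn (11-26) can be tested against a direct Cartesian calculation. In the
Python sketch below the motion r(t) = 2 + sin t, θ(t) = t² is a hypothetical
example chosen only for illustration. The Cartesian acceleration is estimated
by second differences of x = r cos θ, y = r sin θ and compared with the
radial and transverse components resolved along R and Θ.

```python
import math


def r_of(t):
    return 2.0 + math.sin(t)      # hypothetical radial law


def theta_of(t):
    return t * t                  # hypothetical angular law


def position(t):
    r, th = r_of(t), theta_of(t)
    return (r * math.cos(th), r * math.sin(th))


def cartesian_accel(t, h=1e-4):
    """Second-difference estimate of (d2x/dt2, d2y/dt2)."""
    p0, p1, p2 = position(t - h), position(t), position(t + h)
    return tuple((a - 2 * b + c) / (h * h) for a, b, c in zip(p0, p1, p2))


def polar_accel(t):
    """Acceleration from Eqn (11-26), resolved into i and j components."""
    r, th = r_of(t), theta_of(t)
    r_dot, r_ddot = math.cos(t), -math.sin(t)
    th_dot, th_ddot = 2 * t, 2.0
    radial = r_ddot - r * th_dot ** 2
    transverse = 2 * r_dot * th_dot + r * th_ddot
    R = (math.cos(th), math.sin(th))
    Theta = (-math.sin(th), math.cos(th))
    return tuple(radial * a + transverse * b for a, b in zip(R, Theta))


print(cartesian_accel(0.8), polar_accel(0.8))
```

The two printed pairs should agree to several decimal places, the residual
difference being the error of the second-difference estimate.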


Example 11-9 A particle is constrained to move with constant speed v
along the cardioid r = a(1 + cos θ). Prove that

$$v = 2a\dot{\theta}\cos\left(\frac{\theta}{2}\right)$$

and show that the radial component of the acceleration is constant.

Solution From Eqn (11-25) and the expression r = a(1 + cos θ), it follows
that the velocity vector $\dot{\mathbf{r}}$ is given by

$$\dot{\mathbf{r}} = -a\sin\theta\,\dot{\theta}\,\mathbf{R} + a(1 + \cos\theta)\,\dot{\theta}\,\boldsymbol{\Theta}.$$

Now as $v^2 = \dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$, we have

$$v^2 = a^2\dot{\theta}^2\sin^2\theta + a^2\dot{\theta}^2(1 + \cos\theta)^2 = 2a^2\dot{\theta}^2(1 + \cos\theta).$$

Using the identity 1 + cos θ = 2 cos² (θ/2) in this expression and taking the
square root yields the required result

$$v = 2a\dot{\theta}\cos(\theta/2).$$

To complete the problem we now make appeal to the fact that the radial
acceleration component is $\ddot{r} - r\dot{\theta}^2$, whilst by supposition v = constant.
From our previous working we know that

$$v^2 = 2a^2\dot{\theta}^2(1 + \cos\theta),$$

so that differentiating with respect to t and cancelling $\dot{\theta}$ gives

$$\ddot{\theta} = \frac{\dot{\theta}^2\sin\theta}{2(1 + \cos\theta)}$$

or, as

$$\dot{\theta}^2 = \frac{v^2}{2a^2(1 + \cos\theta)},$$

$$\ddot{\theta} = \frac{v^2\sin\theta}{4a^2(1 + \cos\theta)^2}.$$

Hence as $\ddot{r} = -a(\cos\theta\,\dot{\theta}^2 + \sin\theta\,\ddot{\theta})$, substituting for r, $\dot{\theta}^2$, and $\ddot{\theta}$ in the
radial component of acceleration we find, as required, that

$$\ddot{r} - r\dot{\theta}^2 = -\frac{3v^2}{4a} = \text{constant}.$$
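
The algebra of Example 11-9 is easily checked numerically. The sketch below
evaluates the radial component of acceleration at several values of θ for a
particle moving with constant speed v along the cardioid r = a(1 + cos θ),
using only the relations obtained in the solution. It is assumed that
−π < θ < π so that cos(θ/2) > 0, and the numerical values of a and v are
arbitrary illustrative choices.

```python
import math


def radial_acceleration(theta, a, v):
    """Radial acceleration on r = a(1 + cos theta) at constant speed v."""
    # theta_dot from v = 2 a theta_dot cos(theta / 2)
    th_dot = v / (2 * a * math.cos(theta / 2))
    # theta_ddot from differentiating v^2 = 2 a^2 theta_dot^2 (1 + cos theta)
    th_ddot = th_dot ** 2 * math.sin(theta) / (2 * (1 + math.cos(theta)))
    r = a * (1 + math.cos(theta))
    r_ddot = -a * (math.cos(theta) * th_dot ** 2 + math.sin(theta) * th_ddot)
    return r_ddot - r * th_dot ** 2


a, v = 2.0, 3.0
for theta in (0.0, 0.5, 1.0, 2.0, -1.3):
    print(theta, radial_acceleration(theta, a, v), -3 * v * v / (4 * a))
```

Each row should show the same value in the last two columns, confirming that
the radial component does not depend on θ.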


A vector treatment of particle dynamics follows quite naturally from the
ideas presented so far. Thus a particle of variable mass m moving with velocity
v has, by definition, the linear momentum M, where

$$\mathbf{M} = m\mathbf{v}.$$

Now by Newton's second law of motion we know that, with a suitable choice
of units, we may equate the force F to the rate of change of momentum, so
it follows that we may write

$$\mathbf{F} = \frac{d\mathbf{M}}{dt}.$$

However,

$$\frac{d\mathbf{M}}{dt} = \frac{dm}{dt}\,\mathbf{v} + m\,\frac{d\mathbf{v}}{dt}$$

and hence

$$\mathbf{F} = \frac{dm}{dt}\,\mathbf{v} + m\,\frac{d\mathbf{v}}{dt}. \qquad (11\text{-}27)$$

In the case of a particle of constant mass m, we have dm/dt = 0, reducing
Eqn (11-27) to the familiar equation of motion

$$\mathbf{F} = m\mathbf{a}, \qquad (11\text{-}28)$$

where a = dv/dt is the acceleration.

Similarly, the angular momentum of a particle of fixed mass m about the
origin is defined by the relation Ω = r x mv, where r is the position vector
of the particle relative to the origin and v = dr/dt is its velocity. Then the
rate of change of angular momentum about the origin is

$$\frac{d\boldsymbol{\Omega}}{dt} = m\mathbf{v}\times\mathbf{v} + m\mathbf{r}\times\frac{d\mathbf{v}}{dt}
= \mathbf{r}\times\mathbf{F}, \qquad (11\text{-}29)$$

by virtue of Eqn (11-28). This is the vector form of the principle of angular
momentum, which asserts that the rate of change of angular momentum
about the origin is equal to the moment about the origin of the force acting
on the particle.


The line integral

$$J = \int_\Gamma \mathbf{F}\cdot d\mathbf{r}$$

also occurs naturally in many contexts, perhaps the simplest of which is in
connection with the work done by a force. If F is identified with a force, and
dr is a displacement along some specific curve Γ joining points A and B, then
J represents the work done by the varying force F as it moves its point of
application along the curve Γ from A to B (cf. Fig. 11-4).

In the special case that F is a constant force and Γ is a straight line segment
with end points at s = a and s = b this simplifies to an already familiar
result. Suppose that F = Fα and dr = ds β, where α, β are constant unit
vectors inclined at an angle θ, then

$$J = \int_A^B \mathbf{F}\cdot d\mathbf{r} = F(\boldsymbol{\alpha}\cdot\boldsymbol{\beta})\int_a^b ds = F(b - a)\cos\theta.$$

Thus, as would be expected in these circumstances, the work done by F is
the product of the component F cos θ of the force F along the line of motion
and the total displacement (b − a).

The line integral also occurs in fields other than particle dynamics, and in
fluid mechanics for example, if F is identified with the fluid velocity v and Γ
is some closed curve drawn in the fluid, then the scalar quantity γ defined by
the line integral

$$\gamma = \int_\Gamma \mathbf{v}\cdot d\mathbf{r}$$

is called the circulation around the curve Γ. In more advanced works it is
shown that γ provides a measure of the degree of rotational motion present
in a fluid. For a special class of fluid flows known as potential flows the
circulation is everywhere zero, irrespective of the choice of Γ. These flows are
said to be irrotational and are of fundamental importance. Line integrals
around closed curves are generally denoted by the symbol ∮ with the
convention that the path of integration is taken anti-clockwise, so that for the
circulation γ we would write

$$\gamma = \oint_\Gamma \mathbf{v}\cdot d\mathbf{r}.$$

A reversal of the direction of integration around Γ would change the sign
of γ.

An exactly similar application of the line integral occurs in electromagnetic
theory, where the electromotive force (e.m.f.) between the ends A and B of a
wire coinciding with a curve Γ is related to the electric field vector E by the
line integral

$$\text{e.m.f.} = \int_A^B \mathbf{E}\cdot d\mathbf{r}.$$


Example 11-10 Find the work done by a force F = yz i + x j + xz k in
moving its point of application along the curve Γ defined by x = t, y = t²,
z = t³ from the point with parameter t = 1 to the point with parameter
t = 2.


Solution

$$\text{Work done} = \int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_\Gamma (yz\,\mathbf{i} + x\,\mathbf{j} + xz\,\mathbf{k})\cdot(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k})
= \int_\Gamma yz\,dx + x\,dy + xz\,dz.$$

Now as x = t, y = t², z = t³, it follows that

dx = dt, dy = 2t dt, dz = 3t² dt

and so, substituting in the above expression, we find

$$\text{Work done} = \int_1^2 (t^5 + 2t^2 + 3t^6)\,dt = \frac{2923}{42}\ \text{units}.$$


Example 11-11 If the fluid velocity v = x²y i, determine the circulation
γ of v around the contour Γ comprising the boundary of the rectangle
x = ±a, y = ±b.

Solution By definition, the circulation γ is

$$\gamma = \oint_\Gamma \mathbf{v}\cdot d\mathbf{r} = \oint_\Gamma x^2y\,\mathbf{i}\cdot(dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) = \oint_\Gamma x^2y\,dx,$$

where the direction of integration is anti-clockwise around Γ. Now the line
integral around Γ may be represented as the sum of four integrals as follows,

$$\gamma = \int_P^Q x^2y\,dx + \int_Q^R x^2y\,dx + \int_R^S x^2y\,dx + \int_S^P x^2y\,dx,$$

where the limits refer to the corners of the rectangle in Fig. 11-6.

The first and third integrals vanish since x is constant along PQ and RS,
with the consequence that dx = 0. Along QR, y = b and along SP y = −b,
so that

$$\gamma = \int_a^{-a} bx^2\,dx + \int_{-a}^{a} (-bx^2)\,dx = -\frac{4a^3b}{3}.$$
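
The circulation in Example 11-11 can be approximated by summing v . dr over
small straight steps taken anti-clockwise around the rectangle. The Python
sketch below assumes the corner ordering P = (a, −b), Q = (a, b),
R = (−a, b), S = (−a, −b), which is inferred from the solution since
Fig. 11-6 is not reproduced here. The midpoint rule is used on each side,
and the function names are hypothetical.

```python
def velocity(x, y):
    """Planar velocity field v = x^2 y i as an (i, j) tuple."""
    return (x * x * y, 0.0)


def circulation(a, b, n=2000):
    """Anti-clockwise circulation of v around the rectangle x = +-a, y = +-b."""
    corners = [(a, -b), (a, b), (-a, b), (-a, -b), (a, -b)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            xm = x0 + (i + 0.5) * dx    # midpoint of the current step
            ym = y0 + (i + 0.5) * dy
            vx, vy = velocity(xm, ym)
            total += vx * dx + vy * dy
    return total


a, b = 1.0, 2.0
print(circulation(a, b), -4 * a ** 3 * b / 3)
```

Traversing the corners in the opposite order would reverse the sign of the
result, in agreement with the remark on the direction of integration.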


11.4 Fields, gradient, and directional derivative 


The scalar function φ = √(1 − x²) + √(1 − y²) + √(1 − z²) is defined
within and on the cube shaped domain |x| ≤ 1, |y| ≤ 1, |z| ≤ 1 and
assigns a specific number φ to every point within that region. In the language
of vector analysis, φ is said to define a scalar field throughout the cube. In
general, any scalar function φ of position will define a scalar field within its
domain of definition. A typical physical example of a scalar field is provided
by the temperature at each point of a body.

Similarly, if F is a vector function of position, we say that F defines a 
vector field throughout its domain of definition in the sense that it assigns a 
specific vector to each point. Thus the vector function F = sin x i + xy j +
y e^z k defines a vector field throughout all space.

As heat flows in the direction of decreasing temperature, it follows that 
associated with the scalar temperature field within a body there must also be 
a vector field which assigns to each point a vector describing the direction and 
maximum rate of flow of heat. Other physical examples of vector fields are 
provided by the velocity field v throughout a fluid, and the magnetic field H 
throughout a region. 

To examine more closely the nature of a scalar field, and to see one way
in which a special type of vector field arises, we must now define what is
called the gradient of a scalar function. This is a vector differentiation
operation that associates a vector field with every continuously differentiable
scalar function.

DEFINITION 11-5 (gradient of scalar function) If the scalar function
φ(x, y, z) is a continuously differentiable function with respect to the
independent variables x, y, and z then the gradient of φ, written grad φ, is
defined to be the vector

$$\operatorname{grad}\phi = \frac{\partial\phi}{\partial x}\,\mathbf{i} + \frac{\partial\phi}{\partial y}\,\mathbf{j} + \frac{\partial\phi}{\partial z}\,\mathbf{k}.$$

For the moment let it be understood that r = xi + yj + zk is a specific
point, and consider a displacement from it dr = dx i + dy j + dz k. Then it
follows from the definition of grad φ that

$$d\mathbf{r}\cdot\operatorname{grad}\phi = \frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz,$$

in which it is supposed that grad φ is evaluated at r = xi + yj + zk. Theorem
5-19 then asserts that the right-hand side of this expression is simply the total
differential dφ of the scalar function φ, so that we have the result

$$d\phi = d\mathbf{r}\cdot\operatorname{grad}\phi. \qquad (11\text{-}27)$$

If we set ds = | dr |, then dr/ds is the unit vector in the direction of dr.
Writing a = dr/ds, Eqn (11-27) is thus seen to be equivalent to

$$\frac{d\phi}{ds} = \mathbf{a}\cdot\operatorname{grad}\phi. \qquad (11\text{-}28)$$

Because a . grad φ is the projection of grad φ along the unit vector a,
expression (11-28) is called the directional derivative of φ in the direction
of a.

In other words, a . grad φ is the rate of change of φ with respect to distance
measured in the direction of a. We have already utilized the notion of a
directional derivative in connection with the derivation of the Cauchy-Riemann
equations, though at that time neither the term nor vector notation was
employed.

As the largest value of the projection a . grad φ at a point occurs when a
is taken in the same direction as grad φ, it follows that grad φ points in the
direction in which the directional derivative of φ is greatest, that is to say
in the direction of the most rapid increase of φ.

In more advanced treatments of the gradient operator it is this last
property that is used to define grad φ, since it is essentially independent of
the coordinate system that is utilized. From this more general point of view
our Definition 11-5 then becomes the interpretation of grad φ in terms of
rectangular Cartesian coordinates.

The vector differential operator ∇, pronounced either 'del' or 'nabla', is
defined in terms of rectangular Cartesian coordinates as

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}. \qquad (11\text{-}29)$$

As the name implies, ∇ is a vector differential operator, not a vector. It
only generates a vector when it acts on a suitably differentiable scalar
function. We have the obvious result that

$$\operatorname{grad}\phi = \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right)\phi = \nabla\phi. \qquad (11\text{-}30)$$


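Example 11-12 Determine grad φ if φ = z² cos (xy − ¼π), and hence
deduce its value at the point (1, ½π, 1).


Solution We have

$$\frac{\partial\phi}{\partial x} = -yz^2\sin\left(xy - \tfrac{1}{4}\pi\right), \qquad \frac{\partial\phi}{\partial y} = -xz^2\sin\left(xy - \tfrac{1}{4}\pi\right)$$

and

$$\frac{\partial\phi}{\partial z} = 2z\cos\left(xy - \tfrac{1}{4}\pi\right).$$

Hence,

$$\operatorname{grad}\phi = \frac{\partial\phi}{\partial x}\,\mathbf{i} + \frac{\partial\phi}{\partial y}\,\mathbf{j} + \frac{\partial\phi}{\partial z}\,\mathbf{k}
= -yz^2\sin\left(xy - \tfrac{1}{4}\pi\right)\mathbf{i} - xz^2\sin\left(xy - \tfrac{1}{4}\pi\right)\mathbf{j} + 2z\cos\left(xy - \tfrac{1}{4}\pi\right)\mathbf{k}.$$

At the point (1, ½π, 1) we thus have

$$(\operatorname{grad}\phi)_{(1,\,\frac{1}{2}\pi,\,1)} = \frac{1}{\sqrt{2}}\left(-\tfrac{1}{2}\pi\,\mathbf{i} - \mathbf{j} + 2\mathbf{k}\right).$$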
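
A gradient computed by hand can always be checked against central-difference
estimates of the three partial derivatives. The Python sketch below does this
for φ = z² cos (xy − ¼π) at the point (1, ½π, 1). The step length h and the
function names are illustrative assumptions.

```python
import math


def phi(x, y, z):
    return z * z * math.cos(x * y - math.pi / 4)


def grad_phi(x, y, z):
    """Analytic gradient of phi, as obtained in the worked solution."""
    s = math.sin(x * y - math.pi / 4)
    c = math.cos(x * y - math.pi / 4)
    return (-y * z * z * s, -x * z * z * s, 2 * z * c)


def grad_numeric(f, x, y, z, h=1e-6):
    """Central-difference estimate of grad f at (x, y, z)."""
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))


point = (1.0, math.pi / 2, 1.0)
print(grad_phi(*point))
print(grad_numeric(phi, *point))
```

Both printed vectors should agree closely with (1/√2)(−½π, −1, 2).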


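Example 11-13 If r = xi + yj + zk, and r = | r |, deduce the form taken
by grad rⁿ.


Solution As r = (x² + y² + z²)^{1/2}, it follows from Eqn (11-30) and the
chain rule that

$$\operatorname{grad} r^n = \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right) r^n
= nr^{n-1}\left(\frac{\partial r}{\partial x}\,\mathbf{i} + \frac{\partial r}{\partial y}\,\mathbf{j} + \frac{\partial r}{\partial z}\,\mathbf{k}\right).$$

However,

$$\frac{\partial r}{\partial x} = \frac{x}{r}, \qquad \frac{\partial r}{\partial y} = \frac{y}{r}, \qquad \frac{\partial r}{\partial z} = \frac{z}{r},$$

and so

$$\operatorname{grad} r^n = nr^{n-2}(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = nr^{n-2}\,\mathbf{r}.$$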


The following theorem is an immediate consequence of the definition of 
the gradient operator and of the operation of partial differentiation. 


THEOREM 11-6 (properties of gradient operator) If φ and ψ are two
continuously differentiable scalar functions in some domain D, and a, b are
scalar constants, then

(a) grad a = 0;

(b) grad (aφ + bψ) = a grad φ + b grad ψ;

(c) grad (φψ) = φ grad ψ + ψ grad φ.


The surfaces φ(x, y, z) = constant associated with a scalar function φ are
called level surfaces of φ. If we form the total differential of φ at a point on a
specific level surface φ = constant then dφ = 0 and, as in Eqn (5-23), we
obtain the result

$$\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz = 0.$$

This is equivalent to

$$d\mathbf{r}\cdot\operatorname{grad}\phi = 0, \qquad (11\text{-}31)$$

where now dr is constrained to lie in the level surface.

This vector condition shows that grad φ must be normal to dr, and as dr
is constrained to be an arbitrary tangential vector to the level surface at the
point in question, it follows that the vector grad φ must be normal to the level
surface. The unit normal n to the surface is thus n = grad φ/| grad φ |.
Notice that this normal is unique apart from its sign. This simple argument
has proved the following general result.


THEOREM 11-7 (normal to level surface) If φ is a continuously differentiable
scalar function, the unit normal n at any point of the level surface φ =
constant is determined by

$$\mathbf{n} = \frac{\operatorname{grad}\phi}{|\operatorname{grad}\phi|}.$$

Example 11-14 If φ = x² + 3xy² + yz³ − 12, find the unit normal n to
the level surface φ = 3 at the point (1, 2, 1). Deduce the equation of the
tangent plane to the level surface at this point.


Solution The level surface φ = 3 is defined by the equation ψ = 0, where

$$\psi = x^2 + 3xy^2 + yz^3 - 15 = 0.$$

Hence

$$\operatorname{grad}\psi = (2x + 3y^2)\,\mathbf{i} + (6xy + z^3)\,\mathbf{j} + 3yz^2\,\mathbf{k}$$

which, at (1, 2, 1), becomes

$$(\operatorname{grad}\psi)_{(1,2,1)} = 14\mathbf{i} + 13\mathbf{j} + 6\mathbf{k}.$$

As ψ = 0 is the desired level surface, it follows from Theorem 11-7 that the
unit normal to this surface at the point (1, 2, 1) must be,

$$\mathbf{n} = \frac{14\mathbf{i} + 13\mathbf{j} + 6\mathbf{k}}{\sqrt{(14)^2 + (13)^2 + (6)^2}} = \frac{14\mathbf{i} + 13\mathbf{j} + 6\mathbf{k}}{\sqrt{401}}.$$

Now the equation of a plane is n . r = p, where r = xi + yj + zk is a
general point on the plane, n is the unit normal to the plane, and p is its
perpendicular distance from the origin. The point r0 = i + 2j + k is a point
on the plane so that n . r = n . r0 (= p). Hence

$$\left(\frac{14\mathbf{i} + 13\mathbf{j} + 6\mathbf{k}}{\sqrt{401}}\right)\cdot(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \left(\frac{14\mathbf{i} + 13\mathbf{j} + 6\mathbf{k}}{\sqrt{401}}\right)\cdot(\mathbf{i} + 2\mathbf{j} + \mathbf{k}),$$

showing that the required equation is

$$14x + 13y + 6z = 46.$$
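
The steps of Example 11-14 translate directly into a few lines of Python.
The sketch below evaluates grad ψ at the point (1, 2, 1), normalizes it to
obtain n, and forms the constant on the right-hand side of the tangent plane
equation grad ψ . r = grad ψ . r0. The function names are hypothetical.

```python
import math


def grad_psi(x, y, z):
    """Gradient of psi = x^2 + 3xy^2 + yz^3 - 15."""
    return (2 * x + 3 * y * y, 6 * x * y + z ** 3, 3 * y * z * z)


def unit_normal(point):
    g = grad_psi(*point)
    size = math.sqrt(sum(c * c for c in g))
    return tuple(c / size for c in g)


def tangent_plane(point):
    """Return (coefficients, constant) of the plane g . r = g . r0."""
    g = grad_psi(*point)
    return g, sum(c * p for c, p in zip(g, point))


p0 = (1, 2, 1)
print(unit_normal(p0))
print(tangent_plane(p0))   # coefficients (14, 13, 6) and constant 46
```

Working with the unnormalized gradient in the plane equation avoids the
common factor 1/√401, which cancels from both sides in any case.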


We have seen how the gradient operator associates a vector field grad φ
with every continuously differentiable scalar field φ. Any vector field F =
grad φ which is expressible as the gradient of a scalar field φ is called a
conservative vector field, and φ is then referred to as the scalar potential
associated with the vector field.

This has an important implication when line integrals involving
conservative vector fields are considered. Let us suppose that F = grad φ, and
that

$$J = \int_A^B \mathbf{F}\cdot d\mathbf{r}. \qquad (11\text{-}32)$$

Then

$$J = \int_A^B \operatorname{grad}\phi\cdot d\mathbf{r}, \qquad (11\text{-}33)$$

and by virtue of Eqn (11-27) this can be written

$$J = \int_A^B d\phi = \phi(B) - \phi(A). \qquad (11\text{-}34)$$

Hence when F belongs to a conservative vector field, results (11-32) and (11-34)
show that the line integral J of F depends only on the end points of the path
of integration, and not on the path itself.

This fundamental result has far reaching consequences and forms the
basis of many important developments, of which gravitational potential
theory is but one. Suppose, for example, that F is identified with a
conservative force field, then result (11-34) represents the change in the
potential energy of a particle as it moves from point A to point B. That J
depends only on the difference φ(B) − φ(A) and not on the path joining A to B
explains why, when using potential energy considerations in mechanics, no
consideration need be given to the path that is followed.
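
Path independence is easily illustrated numerically. In the Python sketch
below the potential φ = x²y + z and the two paths joining A = (0, 0, 0) to
B = (1, 2, 3) are hypothetical choices made only for illustration. The line
integral of F = grad φ is approximated along each path by a midpoint sum and
compared with φ(B) − φ(A) = 5.

```python
def phi(x, y, z):
    return x * x * y + z           # hypothetical scalar potential


def grad_phi(x, y, z):
    return (2 * x * y, x * x, 1.0)


def straight(t):
    """Straight line from A (t = 0) to B (t = 1)."""
    return (t, 2 * t, 3 * t)


def curved(t):
    """A different path with the same end points."""
    return (t * t, 2 * t, 3 * t ** 3)


def line_integral(path, n=4000):
    """Midpoint-sum approximation to the integral of grad(phi) . dr."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        F = grad_phi(*path((i - 0.5) / n))
        total += sum(f * (c - p) for f, c, p in zip(F, cur, prev))
        prev = cur
    return total


print(line_integral(straight), line_integral(curved))
print(phi(1, 2, 3) - phi(0, 0, 0))
```

Both approximations should be close to 5, whereas for a field that is not
the gradient of any scalar function the two paths would in general give
different values.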


11:5 An application to fluid mechanics 


Let the velocity field v in a fluid as a function of position (x, y, z) and the
time t be denoted by

$$\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}, \qquad (11\text{-}35)$$

where $v_i = v_i(x, y, z, t)$ for i = 1, 2, 3. Clearly, if at any fixed time $t = t_1$,
dr denotes a differential displacement along the line of flow at the point with
position vector $\mathbf{r} = \mathbf{r}(x, y, z, t_1)$ then dr must be parallel to the velocity
vector v at that point. Hence the respective components of dr and v must be
proportional. The lines determined in this manner, which are everywhere
tangential to the velocity field vector, are called the streamlines of the flow
field. More properly these should be called stream surfaces since in three
space dimensions they correspond to surfaces. In the case of a general vector
field F, not necessarily defining a velocity field, they are called field lines.
The condition that dr must be parallel to v implies that the field lines or
streamlines must satisfy the equations

$$\frac{dx}{v_1} = \frac{dy}{v_2} = \frac{dz}{v_3}. \qquad (11\text{-}36)$$

Equations of this form are called differential equations, and methods for
their solution will be explored systematically in the last three chapters.

If, now, r is the position vector of a fluid particle at time t, we have the
obvious vector equation

$$\frac{d\mathbf{r}}{dt} = \mathbf{v}, \qquad (11\text{-}37)$$

which implies the three scalar differential equations

$$\frac{dx}{dt} = v_1, \qquad \frac{dy}{dt} = v_2, \qquad \frac{dz}{dt} = v_3. \qquad (11\text{-}38)$$


Together, the solutions of these last three equations define curves called the
particle trajectories. The particle trajectories are functions of the time, and
are so named because they describe the path followed by individual fluid
particles as they pass through the flow field.

Example 11:15 Find the differential equations of the streamlines and
particle trajectories corresponding to the fluid velocity field
$\mathbf{v} = 2t^2x\,\mathbf{i} + ty\,\mathbf{j} + 3z^2\,\mathbf{k}$.


Solution In this case $v_1 = 2t^2x$, $v_2 = ty$, and $v_3 = 3z^2$, showing that the
differential equations describing the streamlines are

$$\frac{dx}{2t^2x} = \frac{dy}{ty} = \frac{dz}{3z^2},$$

whilst the differential equations describing the particle trajectories are

$$\frac{dx}{dt} = 2t^2x, \quad \frac{dy}{dt} = ty, \quad \frac{dz}{dt} = 3z^2.$$


In this case all of these differential equations are of the type called variables 
separable which may be solved by direct integration after some slight re- 
arrangement. Although the main discussion of the solution of differential 
equations will be postponed until the final chapters of this book, it will be 
instructive to solve the ones that have arisen in connection with this problem. 

The differential equations defining the streamlines are equivalent to two
different relationships between the three space variables x, y, and z. We
choose to work with the first and last pairs of the equations which are,
respectively, equivalent to

$$\frac{dx}{x} = 2t\,\frac{dy}{y} \quad\text{and}\quad \frac{dy}{y} = \frac{t\,dz}{3z^2},$$

with t regarded as a constant. Taking the antiderivatives of these gives

$$\log x = 2t\{\log y + \text{constant}\} \quad\text{and}\quad \log y = -\frac{t}{3z} + \text{constant}.$$

Re-arrangement shows that the streamlines or stream surfaces are described
by the equations

$$x = (C_1 y)^{2t}, \quad y = C_2 e^{-t/3z},$$

where $C_1$, $C_2$ are arbitrary constants. If flow in the plane $z = z_0$ is considered,
then these equations define a curve that is correctly called a streamline. It
would be the curve

$$x = C_1^{2t} C_2^{2t} e^{-2t^2/3z_0}, \quad y = C_2 e^{-t/3z_0}.$$


The particle trajectories are found in similar fashion by finding the
antiderivatives of

$$\frac{dx}{x} = 2t^2\,dt, \quad \frac{dy}{y} = t\,dt, \quad \frac{dz}{z^2} = 3\,dt.$$

Hence

$$\log x = \frac{2}{3}t^3 + \text{constant}, \quad \log y = \frac{t^2}{2} + \text{constant}, \quad -\frac{1}{z} = 3t + \text{constant},$$

showing that

$$x = C_3 e^{2t^3/3}, \quad y = C_4 e^{t^2/2}, \quad z = \frac{1}{C_5 - 3t},$$

where $C_3$, $C_4$, and $C_5$ are arbitrary constants. The position vector of a
particle must thus be

$$\mathbf{r} = C_3 e^{2t^3/3}\,\mathbf{i} + C_4 e^{t^2/2}\,\mathbf{j} + \frac{1}{C_5 - 3t}\,\mathbf{k}.$$
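
The closed-form trajectory can be checked numerically. The sketch below is an illustration only: it integrates $d\mathbf{r}/dt = \mathbf{v}$ for the field of this example with a classical Runge-Kutta step and compares the result with the formula above. The starting point (1, 1, 0.1) at t = 0, which corresponds to $C_3 = C_4 = 1$ and $C_5 = 10$, the step count, and all function names are assumptions chosen for the demonstration.

```python
import math

def velocity(t, p):
    """Velocity field of the example: v = 2 t^2 x i + t y j + 3 z^2 k."""
    x, y, z = p
    return (2.0 * t * t * x, t * y, 3.0 * z * z)

def rk4_step(t, p, h):
    """One classical fourth-order Runge-Kutta step for dr/dt = v(t, r)."""
    def shifted(k, s):
        return tuple(pi + s * ki for pi, ki in zip(p, k))
    k1 = velocity(t, p)
    k2 = velocity(t + h / 2, shifted(k1, h / 2))
    k3 = velocity(t + h / 2, shifted(k2, h / 2))
    k4 = velocity(t + h, shifted(k3, h))
    return tuple(pi + h / 6 * (a + 2 * b + 2 * c + d)
                 for pi, a, b, c, d in zip(p, k1, k2, k3, k4))

def closed_form(t, c3, c4, c5):
    """Trajectory derived in the text: x, y, z as functions of t."""
    return (c3 * math.exp(2 * t ** 3 / 3),
            c4 * math.exp(t ** 2 / 2),
            1.0 / (c5 - 3 * t))

# Assumed initial data: particle at (1, 1, 0.1) when t = 0, followed to t = 1.
steps = 1000
h = 1.0 / steps
p = (1.0, 1.0, 0.1)
for i in range(steps):
    p = rk4_step(i * h, p, h)

numeric = p
closed = closed_form(1.0, 1.0, 1.0, 10.0)
print(numeric)
print(closed)
```

The two printed triples should agree to many decimal places, which supports the integration constants and exponents obtained above.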


PROBLEMS 


Section 11:1 

11-1 Sketch and give a brief description of the curves described by the following
vector functions of a single real variable t:
(a) $\mathbf{r} = a\cos 2\pi t\,\mathbf{i} + b\sin 2\pi t\,\mathbf{j} + t\,\mathbf{k}$;
(b) r = acos 2xti+ ὁ sin 2atj + 7k; 
(c) r = tit 1 + 2k. 

11:2 State which of the following vector functions are everywhere continuous and, 
if they have points of discontinuity, where these occur: 


l 2t— 1 
— rs 1 i + 2 *. 
(a) u() (; + sin? ᾿ εὐ [ + a) pee 


t 
(b) u(t) = fe i+ tan 1] + tek; 
(c) $\mathbf{u}(t) = \tanh t\,\mathbf{i} + \cosh t\,\mathbf{j} + t\sinh t\,\mathbf{k}$;


(d) u(t) = [ ἘΠ) i+ |sin¢|j+ 3k. 


11-3 A vector function $\mathbf{u}(t)$ of a real variable t may be assigned left- and right-hand
limits $\mathbf{u}(t_0-) = \lim_{t \to t_0-} \mathbf{u}(t)$ and $\mathbf{u}(t_0+) = \lim_{t \to t_0+} \mathbf{u}(t)$ with respect to the point $t_0$
in an obvious manner. The vector function $\mathbf{u}(t)$ will be continuous at $t = t_0$
if $\mathbf{u}(t_0-) = \mathbf{u}(t_0+)$. Use this concept to determine which of the following
vector functions are continuous at the stated points:


sinh 2¢ 
t 


(a) u(t) = 


i+ 16 + cosh tk at t = 0; 


Καί ἐ = O; 


(Ὁ) u(t) = sin ti+ --- - ἮΝ 2 31. 


t 1 + 34 
t” = aq” 


t—a 


(c) u(t) = ( i+ cosh ¢j + tanh ἐκ at ¢t = a. 


11-4 Form the vector functions u(t) + v(t), u(t) × v(t), and the scalar function
u(t) . v(t) given that:


: ᾿ , Ι -- fr? 
u(t) = ΕΓ + sinh 1} + ΓΞ, k 
and
$\mathbf{v}(t) = 2t\,\mathbf{i} + \cosh t\,\mathbf{j} + \sin t\,\mathbf{k}$.


11-5 Determine du/dt and d²u/dt² for the vectors u defined in (a) and (c) of
Problem 11:1, and find du/dt for the vector

$$\mathbf{u} = \frac{\mathbf{r}}{r} + (\mathbf{a}\cdot\mathbf{r})\mathbf{b} + \mathbf{a}\times\frac{d^2\mathbf{r}}{dt^2},$$

where $\mathbf{r} = \mathbf{r}(t)$, $r = |\mathbf{r}|$ and a, b are constant vectors.


11-6 The position vector of a particle at time t is
r = cos (t − 1)i + sinh (t − 1)j + afok.
Find the condition imposed on α by requiring that at time t = 1 the
acceleration vector is normal to the position vector.


11-7 Find the unit tangent T to the curve

r= tit (7) + εκ 
at the points corresponding to t = 0 and t = 1.


11-8 Prove results (a) to (c) of Theorem 11:3.


11-9 Prove result (d) of Theorem 11:3: (a) by expansion of the vector product
u × v followed by subsequent recombination of the results; and (b) using
determinants.


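11-10 Find B, T, and N when $t = \tfrac{1}{2}\pi$, given that

$\mathbf{r} = (\cos t + \sin^2 t)\,\mathbf{i} + \sin t\,(1 - \cos t)\,\mathbf{j} - \cos t\,\mathbf{k}$.


11-11 Find B, T, N, κ, and τ for the helix

$\mathbf{r} = (1 - \cos t)\,\mathbf{i} + \sin t\,\mathbf{j} + t\,\mathbf{k}$,
when $t = \tfrac{1}{2}\pi$.


11-12 Prove that if $\mathbf{r}(t)$ is a suitably differentiable function of the real variable t,
and s is the arc length along the parametrically defined curve $\mathbf{r} = \mathbf{r}(t)$, then
with the usual notation

$$\frac{d\mathbf{r}}{dt} = \mathbf{T}\,\frac{ds}{dt}, \qquad \frac{d^2\mathbf{r}}{dt^2} = \mathbf{T}\,\frac{d^2 s}{dt^2} + \kappa\,\mathbf{N}\left(\frac{ds}{dt}\right)^2.$$

Hence show that

$$\frac{d\mathbf{r}}{dt} \times \frac{d^2\mathbf{r}}{dt^2} = \kappa\left(\frac{ds}{dt}\right)^3 \mathbf{B},$$

and deduce that

$$\mathbf{B} = \frac{\dfrac{d\mathbf{r}}{dt} \times \dfrac{d^2\mathbf{r}}{dt^2}}{\left|\dfrac{d\mathbf{r}}{dt} \times \dfrac{d^2\mathbf{r}}{dt^2}\right|}, \qquad \kappa = \frac{\left|\dfrac{d\mathbf{r}}{dt} \times \dfrac{d^2\mathbf{r}}{dt^2}\right|}{\left|\dfrac{d\mathbf{r}}{dt}\right|^3}.$$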


11-13 Apply the results of Problem 11-12 to deduce B, T, and N for the curve 
r= ti+ 153] + 0°k 
when ¢t = 1. 


11:14 This problem gives an elementary derivation of the radius and circle of
curvature for a plane curve. If at a point (ξ, η) on a curve y = f(x), a circle
of radius ρ and centre (α, β) is tangent to the curve and has the same second
derivative, then it is called the circle of curvature at (ξ, η). The number ρ is
called the radius of curvature and (α, β) the centre of curvature.

Let the circle of curvature at (ξ, η) have the equation
$(X - \alpha)^2 + (Y - \beta)^2 = \rho^2$, where (X, Y) is a general point on the circle (see figure).
By differentiation of this equation with respect to X, and using the tangency
condition $dY/dX = f'(\xi)$ at (ξ, η), show that

$$(\xi - \alpha) + (\eta - \beta) f'(\xi) = 0.$$

By a further differentiation of the equation with respect to X, and by using
the equality of second derivatives $d^2Y/dX^2 = f''(\xi)$ at (ξ, η), show that

$$1 + (f'(\xi))^2 + (\eta - \beta) f''(\xi) = 0.$$

Use the fact that (ξ, η) lies on the circle of curvature to deduce that:

$$\alpha = \xi - \frac{f'(\xi)}{f''(\xi)}\left(1 + (f'(\xi))^2\right), \qquad \beta = \eta + \frac{1}{f''(\xi)}\left(1 + (f'(\xi))^2\right),$$

and that

$$\rho = \frac{\left(1 + (f'(\xi))^2\right)^{3/2}}{|f''(\xi)|}.$$


Find the centre and radius of curvature of the curve y = 1 + x? at the 
point (1, 1). 


11:15 Use the results of Problem 11:12 to show that the circular helix
$\mathbf{r} = a\cos t\,\mathbf{i} + a\sin t\,\mathbf{j} + bt\,\mathbf{k}$
has the constant radius of curvature $\rho = (a^2 + b^2)/a$.


11:16 Show from the Serret-Frenet equations and the fact that T = dr/ds, that 
the torsion τ may be expressed in the form: 


dr [a'r d's 
ds |ds? ds? 


ἢ 


11:17 If r = γί), where the parameter ¢ is not the arc length along the curve, 


Tr = 


prove that 
dr drds 
dt dsdi 
d’r = d’r /ds\? dr d#s 
dt2— ds? (5) ds dr? 


dir _ dir (ds\? | ,d’rdsd’s ἀν ds 
413 448 \ dt ds2 dt dt2 © ds dt 


Hence show that 
dr (d’r ἀπ) ἀγ (d’r ἀϑτὴ /dsy* 
dt \dt2~ de3} ἀς (as ds?) \de 

ds dr 
and, as ( a) - 


rr , that 
de [ate ἀν 
dr ΤΕ a _ dt | dt? de? 


-..... Χ --- 
ds? ds8 


ds dr }3 
dt 

Use the result of Problem 11-12 to deduce that 

dr : d*r x d°r dr dr 

dt |dr? des Ι dt 4, 

ΕΝ Tae 
dt de? dt 
11:18 Apply the result of Problem 11-17 to find the torsion τ of the non-constant 

pitch helix 


t _ et 
rm cos ri + sin gj + (° > )x 
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Section 11:2 
11:19 Find the antiderivative of the following two functions f(t): 


(a) f(t) = cosh 2¢i + i + ¢3k; 


(Ὁ) f(t) = ¢? sin ti + οἴ + log tk. 
11:20 Verify the following antiderivatives using Definition 11-3: 


@ {(r. a) =e. r)t+ C= 4r°4+C; 


d’r I dr dr 1 /dr\2 
(6) [( i) ὁ ἐπα "6 -π (al Gs 
(c) [rx Ge ie OE ace 


where C, C are arbitrary constants. 


11:21 Use the result of Problem 11:20 to express dr/d¢ in terms of r, given that r 
satisfies the vector differential equation 


11:22 Evaluate the definite integral 
2 
| (tei +- t log tj + ¢#k)dr. 
1 


11:23 The displacement of a particle P is given in terms of the time ¢ by 
r = cos 211 + sin 21] + εκ. 


If v and f are the magnitudes of the velocity and acceleration respectively, 
show that 


502 = fA1 + £2). 


11:24 A point moving in space has acceleration $\cos t\,\mathbf{i} + \sin t\,\mathbf{j}$. Find the equation
of its path if it passes through the point (−1, 0, 0) with velocity −j + k at
time t = 0.


11-25 Evaluate the line integral of F = xyi + yzj + zk along the contour defined 
byr = ti+ 1 + εκ from t= Otot= 1. 


11:26 Evaluate the line integral of $\mathbf{F} = 4xy\,\mathbf{i} - 2x^2\,\mathbf{j} + 3z\,\mathbf{k}$ from the origin to the
point (2, 1, 0) along the contour:

(a) from the origin to the point (2, 0, 0) and then from the point (2, 0, 0) to
the point (2, 1, 0);

(b) from the origin to the point (0, 1, 0) and then from the point (0, 1, 0) to
the point (2, 1, 0);

(c) from the origin to the point (2, 1, 0) along the straight line joining these
two points.

(Hint: the contours (a), (b), (c) all lie in the plane z = 0.)


11-27 Evaluate the line integral of $\mathbf{F} = 4xy\,\mathbf{i} + 2x^2\,\mathbf{j} + 3z\,\mathbf{k}$ from the origin to the
point (2, 1, 0) along the contours of Problem 11:26.

Section 11:3
11:28 A particle moves in a curve given by 


dé 
— —_— θ [ een e 
r=a(l—cos 9) with oF 3 


Find the components of velocity and acceleration. Show that the velocity is
zero when θ = 0. Find the acceleration when θ = 0.


11:29 A particle moves on that portion of the curve r = ae® cos θ (a = constant) 
for which 0 < @ « }r, so that its radial velocity u remains constant. Find 
its transverse velocity and its radial and transverse components of acceleration 
as functions of uw and θ. 


11-30 If the fluid velocity $\mathbf{v} = y\,\mathbf{i} + 2x\,\mathbf{j}$, determine the circulation γ by integrating
anti-clockwise around the rectangular contour x = ±a, y = ±b. Show that
the sign of γ is reversed if the direction of integration is taken clockwise
around the same contour.


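11-31 Consider the three rectangular regions (a) 0 ≤ x ≤ 1, −1 ≤ y ≤ 1,
(b) 0 ≤ x ≤ 1, 1 ≤ y ≤ 2, and (c) 0 ≤ x ≤ 1, −1 ≤ y ≤ 2, and denote
their boundary curves by $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$. If $\mathbf{F} = 2y\,\mathbf{i} + x\,\mathbf{j}$, evaluate the three
line integrals

$$J_1 = \oint_{\Gamma_1} \mathbf{F}\cdot d\mathbf{r}, \quad J_2 = \oint_{\Gamma_2} \mathbf{F}\cdot d\mathbf{r}, \quad J_3 = \oint_{\Gamma_3} \mathbf{F}\cdot d\mathbf{r},$$

and hence show that $J_1 + J_2 = J_3$.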


11-32 Given that F = cos yi + sin xj, evaluate the line integral of F taken anti- 
clockwise around the triangle with vertices at the points (0,0), (ἐπ, 0), 


(27, $7). 


11-33 A vector field F is said to be irrotational if its line integral around any closed 
curve I’ is zero. By integrating around two conveniently chosen contours, 
deduce which of the following vector fields are irrotational: 

(a) F = y sinh zi + x sinh zj + xy cosh zk; 
(Ὁ) F = xi + yj + zk; 
(c) F = xyz*i + x2z°j + x2yzk. 

Section 11-4 

11-34 Find the gradient of the following functions φ:
(a) φ = cosh xyz;

(b) φ = x² + y² + z²;
(c) φ = xy tanh (x − z).


11-35 Find the directional derivative of the following functions φ in the direction
of the vector (i + 2j − 2k):
(a) φ = 3x² + xy² + yz;
(b) φ = x²yz + cos y;
(c) φ = 1/xyz.


11-36 If new independent variables ¢, ἡ, ξ are introduced through the equations 
f=x+a,7=y+ β, and ¢=2z-+ γ, where a, β, and y are constants, 
and ¢ is a suitably differentiable function, prove that 

8 (85 μὴν a (Pogue κϑ ; 
tax ν᾿ Sag)? ἰδὲ ta, + Kaz) ἢ 
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Deduce from this result the fact that grad ¢ is unchanged by a translation 
of the origin of the coordinates. This property is described by saying that 
grad ¢ is invariant with respect to a translation of the coordinate system. 


11-37 If new independent variables ξ, ἡ, ζ are introduced through the equations 
ξ = aux + aiey + aisz, 
ἢ = 21x + a22y 4282, 
€ = asx + asey + (1382, 
and ¢ is a suitably differentiable function, prove that 


. δ᾽. .. ὃ δα, (ὃ); .ὃ ὃ 
ig εἰν ἐκ} ὁπ (igtig+ ka) 


Deduce from this result the fact that grad ¢ is unchanged by a rotation of 
the coordinate system. This property is described by saying that grad ¢ is 
invariant with respect to a rotation of the coordinate system. 


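11-38 If a is a constant vector and $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $r = |\mathbf{r}|$, prove that

(a) $\operatorname{grad}(\mathbf{a}\cdot\mathbf{r}) = \mathbf{a}$;
(b) $\operatorname{grad} r = \mathbf{r}/r$;
(c) $\operatorname{grad}(1/r) = -\mathbf{r}/r^3$.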


11-39 By using the Cartesian representation of grad φ as expressed in Definition
11:5, prove that
(a) grad (aφ + bψ) = a grad φ + b grad ψ;
(b) grad (φψ) = φ grad ψ + ψ grad φ,
where a, b are scalar constants and φ, ψ are suitably differentiable functions.


11-40 A vector field F will be irrotational if it is expressible in the form F = grad ¢, 
with ¢ a scalar potential. Find the most general scalar potential ¢ that will 
give rise to the irrotational vector field 


F = (x + 3y2z)i + xyz?*j + xy?zk. 


11-41 Find the unit normal n to the surface x² + 2y² − z² − 8 = 0 at the point
(1, 2, 1). Deduce the equation of the tangent plane to the surface at this
point.


11-42 Find the unit normal n to the surface x² − 4y² + 2z² = 6 at the point
(2, 2, 3). Deduce the equation of the plane which has n as its normal and
which passes through the origin.

11:43 If (xo, yo, Zo) is a point on the conic surface 

x2 γὃὲ ζῇ 


“«-. -- 1, 


ab 
show that the tangent plane to the surface at that point is 


a ὃ c 


11-44 The vector field F is generated by the scalar potential φ = x²y. Verify
directly by integration that the line integral of F along each of the three paths
of Problem 11-26 is equal to 4. Confirm this result by using the fact that if
F = grad φ, then

$$\int_A^B \mathbf{F}\cdot d\mathbf{r} = \phi(B) - \phi(A).$$


11:45 The Newtonian law of gravitation asserts that the force of attraction between
point masses $m_1$, $m_2$ distant r apart acts along the line joining them and has
magnitude $(Gm_1 m_2)/r^2$, where G is the gravitational constant. Show that
this force law corresponds to a potential $\phi = (Gm_1 m_2)/r$.


11-46 If $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ is a vector field, then the scalar operator v . grad
expressed in Cartesian coordinates is defined to be

$$\mathbf{v}\cdot\operatorname{grad} \equiv v_1\frac{\partial}{\partial x} + v_2\frac{\partial}{\partial y} + v_3\frac{\partial}{\partial z}.$$

Hence if F, φ are suitably differentiable vector and scalar fields, respectively,
it follows that v . grad φ is a scalar and v . grad F is a vector. Given that


d= x8yz2, v= xyit yj txzk, F= xi + yj — 27k, 
find 
(a) v. grad ¢; 
(b) v. grad F; 
(c) v. grad v. 


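11-47 Special differential operators called the divergence and the curl of a vector
can be defined in terms of Cartesian coordinates by means of the operator ∇.
If $\mathbf{F} = F_1\mathbf{i} + F_2\mathbf{j} + F_3\mathbf{k}$ is a suitably differentiable vector field, then the
divergence of F is denoted either by div F or ∇ . F and is the scalar defined
by

$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}.$$

The curl of F is denoted either by curl F or ∇ × F and is the vector defined by

$$\operatorname{curl}\mathbf{F} = \nabla\times\mathbf{F} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix}.$$

Show that

$$\operatorname{curl}\mathbf{F} = \left(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\right)\mathbf{k}.$$

If φ is a suitably differentiable scalar function show by direct substitution
into the definitions that

(a) div (φF) = F . grad φ + φ div F;
(b) curl (φF) = grad φ × F + φ curl F.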


11-48 Find V.F and V x F given that 
Ε -- x*y9i + y2z?j + xzk. 
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11-49 Prove from the definitions that 

(a) curl grad ¢ = 0 

(b) divcurlF =0, 

where φ, F are suitably differentiable scalar and vector functions respectively. 
11:50 Give an example of a differentiable scalar potential ¢ and vector field F. 


Use them to confirm the results of Problem 11-49 by means of direct differ- 
entiation. 


11:51 In the slow one-dimensional flow of a viscous fluid between parallel plates
the velocity field has the form

$$\mathbf{v} = v_0\left(1 - \frac{x^2}{d^2}\right)\mathbf{k},$$

where the plates coincide with the planes x = ±d and the z-axis points in
the direction of flow. By selecting a convenient contour in the (x, z)-plane,
prove that the circulation is non-zero so that the flow cannot be irrotational.


Section 11:5 
11:52 The velocity field describing a fluid flow is 
Υ ΞΞ 2i + ytj + K. 
Write down the differential equations describing the streamlines and the 
particle trajectories and solve them as in Example 11-15 in the text. 


Series, Taylor’s theorem 
and its uses 


12:1 Series 


The term series denotes the sum of the members of a sequence of numbers
$\{a_n\}$, in which $a_n$ represents the general term. The number of terms added may
be finite or infinite, according as the sequence used is finite or infinite in the
sense of Chapter 3. The sum to N terms of the infinite sequence $\{a_n\}$ is written

$$a_1 + a_2 + \cdots + a_N = \sum_{n=1}^{N} a_n,$$

and it is called a finite series because the number of terms involved in the
summation is finite. The so called infinite series derived from the infinite
sequence $\{a_n\}$ by the addition of all its terms is written

$$a_1 + a_2 + \cdots + a_r + \cdots = \sum_{n=1}^{\infty} a_n.$$
The following are specific examples of numerical series of essentially
different types:

(a) $\sum_{n=1}^{N} n^2 = 1^2 + 2^2 + \cdots + N^2$,
in which the general term $a_n = n^2$;

(b) $\sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$,
in which the general term $a_n = 1/n!$;

(c) $\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} + \cdots$,
in which the general term $a_n = 1/n$;

(d) $\sum_{n=1}^{\infty} \frac{2n^2 + 1}{4n + 2} = \frac{3}{6} + \frac{9}{10} + \frac{19}{14} + \cdots + \frac{2n^2 + 1}{4n + 2} + \cdots$,
in which the general term $a_n = (2n^2 + 1)/(4n + 2)$;

(e) $\sum_{n=1}^{\infty} (-1)^{n+1} = 1 - 1 + 1 - 1 + \cdots + (-1)^{n+1} + \cdots$,
in which the general term $a_n = (-1)^{n+1}$.

Only (a) is a finite series; the remainder are infinite.

There is obviously no difficulty in assigning a sum to a finite series, but
how are we to do this in the case of an infinite series? A practical approach
would be to attempt to approximate the infinite series by means of a finite
series comprising only its first N terms. To justify this it would be necessary
to show in some way that the sum of the remainder $R_N$ of the series after N
terms tends to zero as N increases and, even better if possible, to obtain an
upper bound for $R_N$. This was, of course, the approach adopted in Chapter 6
when discussing the exponential series which comprises example (b). In the
event of an upper bound for $R_N$ being available, this could be used to deduce
the number of terms that need be taken in order to determine the sum to
within a specified accuracy.

The spirit of this practical approach to the summation of series is exactly 
what is adopted in a rigorous discussion of series. The first question to be 
determined is whether or not a given series has a unique sum; the estimation 
of the remainder term follows afterwards, and usually proves to be more 
difficult. 

To assist us in our formal discussion of series we use the already familiar
notion of the nth partial sum $S_n$ of the series $\sum_{n=1}^{\infty} a_n$, which is defined to be the
finite sum

$$S_n = \sum_{r=1}^{n} a_r = a_1 + a_2 + \cdots + a_n.$$

Then, in terms of $S_n$, we have the following definition of convergence, which
is in complete agreement with the approach we have just outlined.


DEFINITION 12-1 (convergence of series) The series $\sum_{n=1}^{\infty} a_n$ will be said to be
convergent to the finite sum S if its nth partial sum $S_n$ is such that

$$\lim_{n\to\infty} S_n = S.$$

If the limit of $S_n$ is not defined, or is infinite, the series will be said to be
divergent.


The remainder after n terms, $R_n$, is given by

$$R_n = a_{n+1} + a_{n+2} + \cdots + a_{n+r} + \cdots,$$

so that if $\{S_n\}$ converges to the limit S, then $R_n = S - S_n$ and Definition
12-1 is obviously equivalent to requiring that

$$\lim_{n\to\infty} (S - S_n) = \lim_{n\to\infty} R_n = 0.$$


Example 12-1 Find the nth partial sum of the series

$$1 + \frac{1}{3} + \frac{1}{3^2} + \cdots + \frac{1}{3^n} + \cdots,$$

and hence show that it converges to the sum 3/2. Find the remainder after
n terms and deduce how many terms need be summed in order to yield a
result in which the error does not exceed 0.01.


Solution This series is a geometric progression with initial term unity and
common ratio 1/3. Its sum to n terms, which is the desired nth partial sum
$S_n$, may be determined by a well known formula (see Problem 12-2) which
gives

$$S_n = \frac{1 - (1/3)^n}{1 - 1/3} = \frac{3}{2}\left(1 - (1/3)^n\right).$$

We have

$$\lim_{n\to\infty} S_n = \lim_{n\to\infty} \frac{3}{2}\left[1 - \left(\frac{1}{3}\right)^n\right] = 3/2,$$

showing that the series is convergent to the sum 3/2.

As $S_n$ is the sum to n terms, the remainder after n terms, $R_n$, must be
given by $R_n = 3/2 - S_n$, and so

$$R_n = \frac{1}{2}\left(\frac{1}{3}\right)^{n-1}.$$

If the remainder must not exceed 0.01, then $R_n \le 0.01$, from which it is easily
seen that the number n of terms needed is n ≥ 5. The determination of $R_n$
was simple in this instance because we were fortunate enough to have
available an explicit formula for $S_n$. In general such a formula is seldom available.
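
The count of terms can be confirmed by summing the series directly. The short sketch below is an illustration only; the function names are assumptions introduced here. It forms $S_n$ term by term, takes $R_n = 3/2 - S_n$, and increases n until the remainder no longer exceeds the tolerance.

```python
def partial_sum(n):
    """S_n = 1 + 1/3 + ... + (1/3)**(n - 1), summed term by term."""
    return sum((1.0 / 3.0) ** k for k in range(n))

def remainder(n):
    """R_n = 3/2 - S_n, which the text shows equals (1/2) * (1/3)**(n - 1)."""
    return 1.5 - partial_sum(n)

def terms_needed(tol):
    """Smallest n for which the remainder does not exceed tol."""
    n = 1
    while remainder(n) > tol:
        n += 1
    return n

print(terms_needed(0.01))
for n in range(1, 7):
    print(n, remainder(n), 0.5 * (1.0 / 3.0) ** (n - 1))
```

The two remainder columns agree to rounding error, and the first n with remainder at most 0.01 is the value found in the solution.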


The definition of convergence has immediate consequences as regards
the addition and subtraction of series. Suppose $\sum a_n$ and $\sum b_n$ are convergent
series with sums α, β. (It is customary to omit summation limits when they
are not important.) Let their respective partial sums be
$S_n = a_1 + a_2 + \cdots + a_n$, $S_n' = b_1 + b_2 + \cdots + b_n$, and consider the series $\sum (a_n + b_n)$ which
has the partial sum $S_n'' = S_n + S_n'$. Then

$$\lim_{n\to\infty} S_n'' = \lim_{n\to\infty} (S_n + S_n') = \lim_{n\to\infty} S_n + \lim_{n\to\infty} S_n' = \alpha + \beta,$$

showing that

$$\sum_{n=1}^{\infty} (a_n + b_n) = \alpha + \beta.$$

A corresponding result for the difference of two series may be proved in
similar fashion. We have established the following general result.
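THEOREM 12:1 (sum and difference of convergent series) If the series
$\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent to the respective sums α and β, then

$$\sum_{n=1}^{\infty} (a_n + b_n) = \alpha + \beta; \qquad \sum_{n=1}^{\infty} (a_n - b_n) = \alpha - \beta.$$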


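Example 12:2 Suppose that $a_n = (1/2)^n$ and $b_n = (1/3)^n$, so that the
series involved are again geometric progressions with $\sum_{n=0}^{\infty} (1/2)^n = 2$ and
$\sum_{n=0}^{\infty} (1/3)^n = 3/2$. Then it follows from Theorem 12:1 that

$$\sum_{n=0}^{\infty} \left[(1/2)^n + (1/3)^n\right] = 7/2 \quad\text{and}\quad \sum_{n=0}^{\infty} \left[(1/2)^n - (1/3)^n\right] = 1/2.$$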
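
A quick numerical check of this example is sketched below, as an illustration only. It assumes the sums are taken from n = 0, which is what the stated values 2 and 3/2 require, and it truncates each series after 60 terms; the helper name is hypothetical.

```python
def geometric_partial(ratio, n_terms):
    """Sum of ratio**n for n = 0, 1, ..., n_terms - 1."""
    return sum(ratio ** n for n in range(n_terms))

n_terms = 60  # the neglected tails are far below double-precision rounding
sum_a = geometric_partial(0.5, n_terms)
sum_b = geometric_partial(1.0 / 3.0, n_terms)

# Term-by-term sum and difference, as in Theorem 12.1.
sum_series = sum(0.5 ** n + (1.0 / 3.0) ** n for n in range(n_terms))
diff_series = sum(0.5 ** n - (1.0 / 3.0) ** n for n in range(n_terms))

print(sum_a, sum_b)
print(sum_series, diff_series)
```

The term-by-term sums agree with the sum and difference of the two separate limits, as the theorem asserts.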


Let us now derive a number of standard tests by which the convergence
or divergence of a series may be established. We begin with a test for
divergence.

Suppose first that a series $\sum a_n$ with nth partial sum $S_n$ converges to the
sum S. Then from our discussion of the convergence of a sequence given in
Chapter 3, we know that for any ε > 0 there must exist some integer N such
that

$$|S_n - S| < \varepsilon \quad\text{for } n > N.$$

This immediately implies the additional result

$$|S_{n+1} - S| < \varepsilon.$$

Hence,

$$\varepsilon + \varepsilon > |S_n - S| + |S_{n+1} - S| = |S_{n+1} - S| + |S - S_n| \ge |S_{n+1} - S_n|.$$

However, as $S_{n+1} - S_n = a_{n+1}$, we have proved that

$$|a_{n+1}| < 2\varepsilon \quad\text{for } n > N.$$

As ε was arbitrary, this shows that for a series to be convergent, it is necessary
that

$$\lim_{n\to\infty} |a_n| = 0$$

or, equivalently,

$$\lim_{n\to\infty} a_n = 0.$$

If this is not the case then the series $\sum_{n=1}^{\infty} a_n$ must diverge. This condition thus
provides us with a positive test for divergence.


THEOREM 12-2(a) (test for divergence) The series $\sum_{n=1}^{\infty} a_n$ diverges if

$$\lim_{n\to\infty} a_n \ne 0.$$


This theorem shows, for example, that the series (d) is divergent, because
$a_n = (2n^2 + 1)/(4n + 2)$, and hence it increases without bound as n increases.

It is important to take note of the fact that this theorem gives no information
in the event that $\lim_{n\to\infty} a_n = 0$. Although we have shown that this is a necessary
condition for convergence, it is not a sufficient condition because divergent
series exist for which the condition is true.

Theorem 12:2(a) gives no information about either series (b) or (c) as in
each case $\lim_{n\to\infty} a_n = 0$. In fact, by using another argument, we have already
proved that the series representation for e in (b) is convergent, whereas we
shall prove shortly that the harmonic series (c) is divergent. Series (e) must
also be divergent according to our definition, because $a_n$ oscillates finitely
between 1 and −1, and also $S_n$ does not tend to any limit.
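
The gap between "the terms tend to zero" and "the series converges" can be seen numerically. The sketch below is an illustration only, with hypothetical function names. It finds how many terms of the harmonic series (c) are needed before the partial sum passes a given bound, and lists the first few partial sums of series (e). A computation of this kind suggests, but does not prove, the behaviour described in the text.

```python
def first_index_exceeding(bound):
    """Smallest n for which 1 + 1/2 + ... + 1/n exceeds bound."""
    total, n = 0.0, 0
    while total <= bound:
        n += 1
        total += 1.0 / n
    return n

def oscillating_partial_sums(count):
    """First `count` partial sums of series (e): 1 - 1 + 1 - 1 + ..."""
    sums, total = [], 0
    for n in range(1, count + 1):
        total += (-1) ** (n + 1)
        sums.append(total)
    return sums

# The terms 1/n shrink to zero, yet the partial sums keep passing each bound.
for bound in (2, 3, 4, 5):
    print(bound, first_index_exceeding(bound))

# The partial sums of (e) alternate and so approach no limit.
print(oscillating_partial_sums(6))
```

The number of terms needed grows rapidly with the bound, which is why the divergence of the harmonic series is slow and easy to miss in a short computation.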

The terms of series are not always of the same sign, and so it is useful to
associate with the series $\sum a_n$ the companion series $\sum |a_n|$. If this latter
series is convergent, then the series $\sum a_n$ is said to be absolutely convergent. It
can happen that although $\sum a_n$ is convergent, $\sum |a_n|$ is divergent. When this
occurs the series $\sum a_n$ is said to be conditionally convergent. Now when terms
of differing signs are involved, the sum of the absolute values of the terms of a
series clearly exceeds the sum of the terms of the series, and so it seems
reasonable to expect that absolute convergence implies convergence. Let us prove
this fact.


THEOREM 12-2(b) (absolute convergence implies convergence) If the series
$\sum_{n=1}^{\infty} |a_n|$ is convergent, then so also is the series $\sum_{n=1}^{\infty} a_n$.


Proof The proof of this result is simple. Let $S_n = |a_1| + |a_2| + \cdots + |a_n|$
and $S_n' = a_1 + a_2 + \cdots + a_n$ be the nth partial sums, respectively,
of the series in Theorem 12-2(b). Then, as $a_r + |a_r|$ is either zero or $2|a_r|$, it
follows that

$$0 \le S_n + S_n' \le 2S_n.$$

Now by supposition $\lim_{n\to\infty} S_n = S$ exists, and the sums $S_n + S_n'$ are
non-decreasing because each term $a_r + |a_r|$ is non-negative. A non-decreasing
sequence that is bounded above has a limit, so that taking limits we arrive at

$$0 \le \lim_{n\to\infty} (S_n + S_n') \le 2S.$$

This implies that the series with nth term $a_n + |a_n|$ must be convergent and
hence, using Theorem 12:1 with $a_n = (a_n + |a_n|) - |a_n|$, that $\sum_{n=1}^{\infty} a_n$ must be
convergent.


Example 12.3 Consider the series

$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots.$$

As $a_n = (-1)^n/n!$, we have $|a_n| = 1/n!$, which is the general term of the
exponential series defining e. Thus Theorem 12-2(b), and the convergence of
the exponential series, together imply the convergence of $\sum_{n=0}^{\infty} (-1)^n/n!$. In
fact this is the series representation of 1/e.
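
The closing remark can be checked with a few lines of code. The sketch below is an illustration only, with a hypothetical function name. It accumulates the partial sums of $\sum (-1)^n/n!$, building each term from the previous one, and compares the result with 1/e from the standard library.

```python
import math

def alternating_exponential(n_terms):
    """Partial sum of (-1)**n / n! for n = 0, 1, ..., n_terms - 1."""
    total, term = 0.0, 1.0          # term holds (-1)**n / n!, starting at n = 0
    for n in range(n_terms):
        total += term
        term *= -1.0 / (n + 1)      # next term from the current one
    return total

for n_terms in (2, 5, 10, 25):
    print(n_terms, alternating_exponential(n_terms), math.exp(-1.0))
```

The partial sums settle on the value of 1/e after only a modest number of terms, reflecting the rapid decay of 1/n!.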


Suppose $\sum b_n$ is a convergent series of positive terms, and that $\sum a_n$ is a
series with the property that if N is some positive integer, then $|a_n| \le b_n$
for n > N. Then clearly the convergence of $\sum b_n$ implies the convergence of
$\sum |a_n|$ and, by Theorem 12:2(b), also the convergence of $\sum a_n$. By a similar
argument, if for n > N, $0 \le b_n \le a_n$, and $\sum b_n$ is known to be divergent,
then clearly $\sum a_n$ must also be divergent. We incorporate these results into a
useful comparison test.


THEOREM 12:3 (comparison test)

(a) Convergence test Let $\sum b_n$ be a convergent series of positive terms, and
let $\sum a_n$ be a series with the property that there exists a positive integer N
such that

$$|a_n| \le b_n \quad\text{for } n > N.$$

Then $\sum a_n$ is an absolutely convergent series.

(b) Divergence test Let $\sum b_n$ be a divergent series of positive terms, and let
$\sum a_n$ be a series of positive terms with the property that there exists a positive
integer N such that

$$0 \le b_n \le a_n \quad\text{for } n > N.$$

Then $\sum a_n$ is a divergent series.


Example 12:4
(a) Consider the series $\sum_{n=1}^{\infty} [2 + (-1)^n]/2^n$. We have

$$|a_n| = \frac{2 + (-1)^n}{2^n} \le \frac{3}{2^n},$$

and as $\sum_{n=1}^{\infty} 3/2^n = 3\sum_{n=1}^{\infty} 1/2^n = 3$, the conditions of Theorem 12:3 (a) are
satisfied if we set $b_n = 3/2^n$. It thus follows that the series $\sum a_n$ is convergent.

(b) Consider the series $\sum_{n=1}^{\infty} (n + 1)/n^2$. Here we have

$$a_n = \frac{n + 1}{n^2} = \frac{1}{n} + \frac{1}{n^2} > \frac{1}{n},$$

and as the harmonic series $\sum 1/n$ is divergent, the conditions of Theorem
12:3 (b) are satisfied when we set $b_n = 1/n$. Hence $\sum a_n$ is divergent.
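
The two comparisons can be checked term by term. The sketch below is an illustration only; the function names and the cut-off of 200 terms are assumptions. It verifies the inequalities required by the comparison test and shows the partial sums of the series in part (a) staying below those of the dominating geometric series.

```python
def a_conv(n):
    """General term of part (a): (2 + (-1)**n) / 2**n."""
    return (2 + (-1) ** n) / 2 ** n

def b_conv(n):
    """Dominating geometric term 3 / 2**n used in part (a)."""
    return 3 / 2 ** n

def a_div(n):
    """General term of part (b): (n + 1) / n**2."""
    return (n + 1) / n ** 2

limit = 200
dominated = all(abs(a_conv(n)) <= b_conv(n) for n in range(1, limit))
dominates_harmonic = all(a_div(n) >= 1 / n for n in range(1, limit))

sum_a = sum(a_conv(n) for n in range(1, limit))
sum_b = sum(b_conv(n) for n in range(1, limit))

print(dominated, dominates_harmonic)
print(sum_a, sum_b)
```

Splitting the series in part (a) into two geometric progressions, $\sum 2/2^n$ and $\sum (-1/2)^n$, gives the sum 2 − 1/3 = 5/3, which is the value the first partial sum approaches.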


Fig. 12.1 Comparison between series and integral.


A powerful test for the convergence or divergence of a series $\sum a_n$ of
positive terms follows by a comparison of the shaded rectangles in Fig. 12.1.

Let f(x) be a non-increasing function defined for 1 ≤ x < ∞ which
decreases to zero as x tends to infinity, and let $f(n) = a_n$, where n is an integer.
Then we have the obvious inequality

$$\sum_{r=2}^{n} f(r) \le \int_1^n f(x)\,dx \le \sum_{r=1}^{n-1} f(r)$$

or, equivalently,

$$\sum_{r=2}^{n} a_r \le \int_1^n f(x)\,dx \le \sum_{r=1}^{n-1} a_r.$$

As the two sums in this inequality differ only in the terms $a_1$ and $a_n$, and
$a_n$ tends to zero, it must follow that in the limit, the infinite series $\sum a_n$ and the
integral

$$\lim_{n\to\infty} \int_1^n f(x)\,dx$$

converge or diverge together. This conclusion may be incorporated into a
test as follows.


THEOREM 12-4 (integral test) Let f(x) be a positive non-increasing function
defined on 1 ≤ x < ∞ with $\lim_{x\to\infty} f(x) = 0$. Then, if $a_n = f(n)$, the series
$\sum_{n=1}^{\infty} a_n$ converges or diverges according as

$$\lim_{n\to\infty} \int_1^n f(x)\,dx$$

is finite or infinite.


Corollary 12-4 ($R_N$ deduced from integral test). Let f(x) be a positive
non-increasing function defined on 1 ≤ x < ∞ with $\lim_{x\to\infty} f(x) = 0$, and let
$\sum_{n=1}^{\infty} a_n$ be convergent, where $a_n = f(n)$. Then the remainder $R_N$ after N terms
satisfies the inequality

$$R_N \le \int_N^{\infty} f(x)\,dx.$$

Proof The result follows at once from the obvious inequality

$$\sum_{r=N+1}^{N'} a_r \le \int_N^{N'} f(x)\,dx \le \sum_{r=N}^{N'} a_r,$$

by taking the limit as N' → ∞. This is possible because, by hypothesis,
$\sum a_n$ is convergent so that the improper integral involved exists.


Example 12-5
(a) Consider the series $\sum_{n=1}^{\infty} 1/n^k$, where k > 0. Then the function $f(x) = 1/x^k$
satisfies the conditions of Theorem 12-4. Hence this series converges or
diverges according as

$$\lim_{n\to\infty} \int_1^n \frac{dx}{x^k}$$

is finite or infinite. If k ≠ 1 we have

$$\lim_{n\to\infty} \int_1^n \frac{dx}{x^k} = \lim_{n\to\infty} \left[\frac{1}{1 - k}\left(\frac{1}{n^{k-1}} - 1\right)\right].$$

Hence for 0 < k < 1 this limit is infinite, showing that the series is divergent
for k in this range, whereas for k > 1 this limit has the finite value 1/(k − 1),
showing that the series is convergent for k > 1. Applying Corollary 12-4
shows that when k > 1, the remainder $R_N$ after N terms must satisfy the
inequality

$$R_N \le N^{(1-k)}/(k - 1).$$

When k = 1 we obtain the harmonic series, which must be treated
separately. As it follows that

$$\lim_{n\to\infty} \int_1^n \frac{dx}{x} = \lim_{n\to\infty} \log n \to \infty,$$

we have proved that the harmonic series is divergent.

(b) Consider the series $\sum_{n=1}^{\infty} n/(1 + n^2)$. Here we set $f(x) = x/(1 + x^2)$, so
we must examine

$$L = \lim_{n\to\infty} \int_1^n \frac{x\,dx}{1 + x^2}.$$

Setting $x^2 = u$ we find

$$L = \lim_{n\to\infty} \tfrac{1}{2}\left[\log (1 + n^2) - \log 2\right] \to \infty.$$

Hence the series is divergent.
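
The remainder estimate of Corollary 12.4 can be tried out on the case k = 2. The sketch below is an illustration only. It assumes the known value $\pi^2/6$ for $\sum 1/n^2$, which is quoted here and not derived in the text, and the function names are hypothetical.

```python
import math

def partial_sum(n_terms, k):
    """Sum of 1 / n**k for n = 1, ..., n_terms."""
    return sum(1.0 / n ** k for n in range(1, n_terms + 1))

def remainder_bound(n_terms, k):
    """Bound from Corollary 12.4: R_N <= N**(1 - k) / (k - 1), valid for k > 1."""
    return n_terms ** (1 - k) / (k - 1)

# Assumption: the sum of 1/n**2 over all n is pi**2 / 6 (a standard result).
exact = math.pi ** 2 / 6

for n_terms in (10, 100, 1000):
    actual = exact - partial_sum(n_terms, 2)
    print(n_terms, actual, remainder_bound(n_terms, 2))
```

In each row the actual remainder lies just below the bound 1/N, so the estimate is close as well as safe for this series.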


Two other useful tests known as the ratio test and the nth root test may 
be derived from Theorem 12:3, essentially using a geometric progression for 
purposes of comparison. The idea involved in these tests is that a series is 
tested against itself, and that its convergence or divergence is then deduced 
from the rate at which successive terms decrease or increase. 

Suppose that $\sum a_n$ is a series for which the ratio $a_{n+1}/a_n$ is always defined
and that $\lim_{n\to\infty} |a_{n+1}/a_n| = L$, where L < 1. Let r be some fixed number
such that L < r < 1. Then the existence of the limit L implies that there
exists an integer N such that

$$|a_{n+1}| < r\,|a_n| \quad\text{for } n > N.$$

Hence it follows that

$$|a_{N+2}| < r\,|a_{N+1}|, \quad |a_{N+3}| < r\,|a_{N+2}| < r^2\,|a_{N+1}|, \ldots,$$

and in general

$$|a_{N+m+1}| < r^m\,|a_{N+1}|.$$

Thus if $R_N$ is the remainder after N terms we have

$$R_N = \sum_{n=N+1}^{\infty} a_n \le \sum_{n=N+1}^{\infty} |a_n| < |a_{N+1}|\,(1 + r + r^2 + \cdots). \tag{*}$$

The expression in brackets is a convergent geometric progression because,
by hypothesis, r < 1. As the remainder term $R_N$ is finite, and is less than the
sum of the absolute values of the terms comprising the tail of the series, it is
easily seen that the series $\sum a_n$ must be absolutely convergent. If L > 1 the
terms grow in size, and the series $\sum a_n$ is divergent. Nothing may be deduced
if L = 1 for then the series may either be convergent or divergent as
illustrated by Example 12:5(a). In that case $a_{n+1}/a_n = n^k/(n + 1)^k$, giving
$\lim |a_{n+1}/a_n| = 1$; and the series was seen to be divergent for 0 < k ≤ 1
and convergent for k > 1.
Expressed formally, as follows, these results are called the ratio test.


THEOREM 12:5 (ratio test) If the series $\sum_{n=1}^{\infty} a_n$ is such that $a_n \ne 0$ and

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = L,$$

then

(a) the series $\sum a_n$ converges absolutely if L < 1,
(b) the series $\sum a_n$ diverges if L > 1,
(c) the test fails if L = 1.


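Example 12.6
(a) Consider the series

\[ \sum_{n=1}^{\infty} \frac{(-1)^n\, n!}{n^n}. \]

Then $a_n \ne 0$ and

\[ \frac{a_{n+1}}{a_n} = (-1)^{2n+1}\, \frac{(n+1)!\, n^n}{(n+1)^{n+1}\, n!} = -\left( \frac{n}{n+1} \right)^{n} = -\frac{1}{(1 + 1/n)^n}. \]

Hence

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{1}{(1 + 1/n)^n} = \frac{1}{e}, \]

where the final result follows by virtue of the work of Section 3.3. As $e > 1$, the ratio test proves the absolute convergence of this series.

(b) Consider the series $\sum_{n=1}^{\infty} 1/n!$. Here $a_n = 1/n! \ne 0$ and

\[ \frac{a_{n+1}}{a_n} = \frac{n!}{(n+1)!} = \frac{1}{n+1} = \left| \frac{a_{n+1}}{a_n} \right|. \]

Hence

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{1}{n+1} = 0, \]

and as $0 < 1$ the ratio test proves the series to be convergent.

(c) Consider the series $\sum_{n=1}^{\infty} 3^n/n$. Then $a_n \ne 0$ and

\[ \frac{a_{n+1}}{a_n} = \left( \frac{3^{n+1}}{n+1} \right) \left( \frac{n}{3^n} \right) = 3 \left( \frac{n}{n+1} \right) = \left| \frac{a_{n+1}}{a_n} \right|. \]

Now

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \frac{3n}{n+1} = 3, \]

and as $3 > 1$ the ratio test proves the series to be divergent.

(d) Consider the series $\sum_{n=1}^{\infty} 1/(2n+1)^2$. Then $a_n \ne 0$ and

\[ \frac{a_{n+1}}{a_n} = \left( \frac{2n+1}{2n+3} \right)^{2} = \left| \frac{a_{n+1}}{a_n} \right|. \]

Now

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n\to\infty} \left( \frac{2n+1}{2n+3} \right)^{2} = 1, \]

so that the ratio test fails in this case. In fact the series is convergent, as may readily be proved by use either of the comparison test, with $b_n = 1/n^2$, or the integral test.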
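
The limits in Example 12.6 can be checked numerically. The following Python sketch is only an illustration, not part of the formal argument: the helper name `term_ratio` and the choice of $n = 200$ are assumptions made here, and exact rational arithmetic is used so that the large factorials cause no rounding problems.

```python
from fractions import Fraction
from math import factorial, e

def term_ratio(term, n):
    """Exact value of |a_{n+1} / a_n| for a term function returning Fractions."""
    return abs(term(n + 1) / term(n))

# General terms a_n of the four series in Example 12.6 (n starts at 1).
examples = {
    "a": lambda n: Fraction((-1) ** n * factorial(n), n ** n),
    "b": lambda n: Fraction(1, factorial(n)),
    "c": lambda n: Fraction(3 ** n, n),
    "d": lambda n: Fraction(1, (2 * n + 1) ** 2),
}

# Ratios at a moderately large n, to be compared with L = 1/e, 0, 3 and 1.
ratios = {key: float(term_ratio(term, 200)) for key, term in examples.items()}

for key, value in ratios.items():
    print(key, value)
print("1/e =", 1 / e)
```

A finite value of $n$ can only suggest the limit $L$; in case (d) the ratio stays below 1 for every $n$ and yet the test fails, because it is the limit, not the individual ratios, that matters.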


As the remainder term $R_N$ used in the proof of the ratio test may be either positive or negative, the estimate (*) is equivalent to

\[ |R_N| < |a_{N+1}|\,(1 + r + r^2 + \cdots) \]

or, summing the geometric progression, to

\[ |R_N| < \frac{|a_{N+1}|}{1 - r}. \]

This simple result provides an estimate of the error if the summation is terminated after $N$ terms and comprises our next result.


Corollary 12.5 ($R_N$ deduced from ratio test) Let the series $\sum_{n=1}^{\infty} a_n$ be convergent, and let the ratio test be applicable with

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| = L. \]

Then, if $r$ is a number such that $L < r < 1$, and $N$ is large enough for $|a_{n+1}| < r\,|a_n|$ to hold whenever $n > N$, the remainder $R_N$ after $N$ terms is such that

\[ |R_N| < \frac{|a_{N+1}|}{1 - r}. \]

Let us use Example 12.6 (a) to illustrate this and to compute $|R_5|$. We have $L = 1/e$ and, as $e = 2.7182\ldots$, we could take $r = 0.5$. Then $1/(1 - r) = 2$, and hence $|R_5| < 48/625$.
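
The estimate $|R_N| < |a_{N+1}|/(1 - r)$ is easily tested on the series of Example 12.6 (a). The sketch below is an illustration under stated assumptions: the function names are invented here, and the true remainder is approximated by summing only the next 60 terms of the tail, the neglected terms being far smaller than the quantities compared.

```python
from fractions import Fraction
from math import factorial

def a(n):
    """General term of Example 12.6 (a): (-1)^n n! / n^n."""
    return Fraction((-1) ** n * factorial(n), n ** n)

def ratio_bound(N, r):
    """Overestimate |a_{N+1}| / (1 - r) of |R_N| from the ratio test."""
    return abs(a(N + 1)) / (1 - r)

def tail(N, terms=60):
    """Floating-point approximation to R_N using the next `terms` terms."""
    return sum(float(a(n)) for n in range(N + 1, N + 1 + terms))

r = Fraction(1, 2)
print("bound for N = 5:", ratio_bound(5, r))   # an exact fraction
print("approximate R_5:", tail(5))
```

For $N = 5$ the formula gives $2\,|a_6| = 5/162$, which is smaller than $48/625$, so the figure quoted above is certainly an overestimate of $|R_5|$; the value $48/625$ itself equals $2\,|a_5|$.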

In the so-called $n$th root test, appeal is also made to the geometric progression to prove convergence. Suppose the series $\sum a_n$ is such that

\[ \lim_{n\to\infty} \sqrt[n]{|a_n|} = L, \]

and that $L < 1$. Then if $r$ is some definite number such that $L < r < 1$, the existence of the limit implies that there exists an integer $N$ such that

\[ \sqrt[n]{|a_n|} < r \quad \text{for } n > N. \]

Hence $|a_n| < r^n$ for $n > N$. Thus, as with the ratio test, the remainder after $N$ terms may be overestimated by the sum of the absolute values of the remaining terms, and the result still further overestimated by a geometric progression with common ratio $r$. As $r < 1$ this remainder is finite, thereby establishing that $\sum a_n$ is absolutely convergent. If $L > 1$, then successive terms grow and the series is divergent. As with the ratio test, the $n$th root test fails when $L = 1$, for then $\sum a_n$ may be either convergent or divergent. Stated formally we have:


THEOREM 12.6 ($n$th root test) If the series $\sum_{n=1}^{\infty} a_n$ is such that

\[ \lim_{n\to\infty} \sqrt[n]{|a_n|} = L, \]

then

(a) the series $\sum a_n$ is absolutely convergent if $L < 1$,
(b) the series $\sum a_n$ is divergent if $L > 1$,
(c) the test fails if $L = 1$.


Example 12.7
(a) Consider the series

\[ \sum_{n=1}^{\infty} \left( \frac{nk}{3n+1} \right)^{n}, \]

where $k$ is a constant. Then

\[ a_n = \left( \frac{nk}{3n+1} \right)^{n} \quad \text{and} \quad L = \lim_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \frac{nk}{3n+1} = \frac{k}{3}. \]

Thus the $n$th root test shows that the series will be convergent if $k < 3$ and divergent if $k > 3$. It fails if $k = 3$, though Theorem 12.2 then shows the series to be divergent.

(b) Consider the series $\sum_{n=1}^{\infty} n/2^n$.

Then $a_n = n/2^n = |a_n|$, and $\sqrt[n]{|a_n|} = \tfrac{1}{2}\sqrt[n]{n}$. Taking logarithms we find

\[ \log \left[ \sqrt[n]{|a_n|} \right] = \log \tfrac{1}{2} + \frac{1}{n} \log n. \]

Now by Theorem 6.4 (b) we know that $\lim (\log n)/n = 0$, so that

\[ \lim_{n\to\infty} \log \left[ \sqrt[n]{|a_n|} \right] = \log \tfrac{1}{2}, \]

whence

\[ \lim_{n\to\infty} \sqrt[n]{|a_n|} = \tfrac{1}{2}. \]

As $\tfrac{1}{2} < 1$ the test thus proves convergence. In this instance it would have been simpler to use the ratio test to prove convergence.


If $\sum a_n$ is convergent by the $n$th root test, then we have seen that a number $N$ exists such that $|a_n| < r^n$ for $n > N$, where $0 < r < 1$. Hence we have

\[ R_N = \sum_{n=N+1}^{\infty} a_n \le |R_N| \le \sum_{n=N+1}^{\infty} |a_n| < \sum_{n=N+1}^{\infty} r^n = \frac{r^{N+1}}{1 - r}, \]

and so

\[ |R_N| < \frac{r^{N+1}}{1 - r}. \]

We express this overestimate of the remainder term as a corollary to the $n$th root test.
Corollary 12.6 ($R_N$ deduced from $n$th root test) Let $\sum_{n=1}^{\infty} a_n$ be convergent by the $n$th root test with

\[ \lim_{n\to\infty} \sqrt[n]{|a_n|} = L. \]

Then if $r$ is a number such that $L < r < 1$, and $N$ is large enough for $|a_n| < r^n$ to hold whenever $n > N$, the remainder $R_N$ after $N$ terms is such that

\[ |R_N| < \frac{r^{N+1}}{1 - r}. \]

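We illustrate this result by obtaining an estimate for the remainder after three terms of the convergent series of Example 12.7 (b). In that case $L = \tfrac{1}{2}$ so that we must choose $r$ such that $\tfrac{1}{2} < r < 1$. If we select $r = \tfrac{5}{8}$, then

\[ |R_3| < \frac{(5/8)^4}{1 - 5/8} = \frac{625}{1536}. \]

Had $r$ been chosen closer to the value $\tfrac{1}{2}$, then a sharper estimate would have been obtained. Thus by taking $r = 9/16$ it follows that

\[ |R_3| < \frac{6561}{28672}. \]

These figures must be treated with caution, because the estimate is justified only when $N$ is large enough for $|a_n| < r^n$ to hold for every $n > N$. With $r = \tfrac{5}{8}$ this requires $n < (5/4)^n$, which fails for $2 \le n \le 10$, and in fact the exact remainder here is $R_3 = 2 - \tfrac{11}{8} = \tfrac{5}{8}$, which exceeds both values. The closer $r$ is taken to $L$, the larger $N$ must be before the estimate may be trusted.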
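
The dependence of the $n$th root estimate on the size of $N$ can be explored with a short Python sketch for the series $\sum n/2^n$ of Example 12.7 (b). The function names, the search limit of 200, and the closed form $R_N = (N+2)/2^N$ for the exact remainder (which follows from the sum $\sum_{n=1}^{\infty} n/2^n = 2$) are assumptions introduced here for illustration.

```python
from fractions import Fraction

def root_test_bound(N, r):
    """Overestimate r^(N+1) / (1 - r) of |R_N| from the nth root test."""
    return r ** (N + 1) / (1 - r)

def exact_remainder(N):
    """Exact remainder of sum n/2^n after N terms, namely (N + 2) / 2^N."""
    return Fraction(N + 2, 2 ** N)

def first_valid_N(r, limit=200):
    """Smallest N such that n/2^n < r^n for every n with N < n <= limit."""
    worst = 0
    for n in range(1, limit + 1):
        if Fraction(n, 2 ** n) >= r ** n:
            worst = n          # the inequality |a_n| < r^n fails at this n
    return worst

r = Fraction(5, 8)
print(root_test_bound(3, r), exact_remainder(3))    # 625/1536 5/8
N = first_valid_N(r)
print(N, root_test_bound(N, r) > exact_remainder(N))
```

With $r = 5/8$ the search returns $N = 10$, and from that point onwards the bound does exceed the exact remainder, as the corollary requires.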


For our final result we prove that all series in which the signs of terms alternate, whilst the absolute values of successive terms decrease monotonically to zero, are convergent. Such series are called alternating series and are of the general form

\[ \sum_{n=1}^{\infty} (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + \cdots, \]

where $a_n > 0$ for all $n$.

To prove our assertion of convergence we assume $a_1 > a_2 > a_3 > \cdots$, and $\lim a_n = 0$, and first consider the partial sum $S_{2r}$ corresponding to an even number of terms $2r$. We write $S_{2r}$ in the form

\[ S_{2r} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2r-1} - a_{2r}). \]

Then, because $a_1 > a_2 > a_3 > \cdots$, it follows that $S_{2r} > 0$. By a slight rearrangement of the brackets we also have

\[ S_{2r} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2r-2} - a_{2r-1}) - a_{2r}, \]

showing that as all the brackets and quantities are positive, $S_{2r} < a_1$. Hence, as $S_{2r}$ is a bounded monotonic increasing sequence (each additional bracket $(a_{2r+1} - a_{2r+2})$ in the first form is positive), we know from Chapter 3 that it must tend to a limit $S$, where

\[ 0 < S < a_1. \]

Next consider the partial sum $S_{2r+1}$ corresponding to an odd number of terms $2r + 1$. We may write $S_{2r+1} = S_{2r} + a_{2r+1}$. Then, taking the limit of $S_{2r+1}$ we have

\[ \lim_{r\to\infty} S_{2r+1} = \lim_{r\to\infty} S_{2r} + \lim_{r\to\infty} a_{2r+1} = S, \]

because by supposition $\lim a_{2r+1} = 0$. Thus both the partial sums $S_{2r}$ and the partial sums $S_{2r+1}$ tend to the same limit $S$. Hence we have proved that for $n$ both even and odd

\[ \lim_{n\to\infty} S_n = S, \]

thereby showing that the series converges.


THEOREM 12.7 (alternating series test) The series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges if $a_n > 0$ and $a_{n+1} < a_n$ for all $n$ and, in addition,

\[ \lim_{n\to\infty} a_n = 0. \]

Example 12.8
(a) Consider the alternating series

\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{2^n} = \frac{1}{2} - \frac{1}{4} + \frac{1}{8} - \frac{1}{16} + \cdots, \]

in which the absolute value of the general term $a_n = 1/2^n$. Then, as it is true that $a_{n+1} < a_n$ and $\lim a_n = 0$, the test shows that the series is convergent.
(b) Consider the alternating series

\[ \sum_{n=1}^{\infty} (-1)^{n+1} \sqrt[n+1]{2} = \sqrt{2} - \sqrt[3]{2} + \sqrt[4]{2} - \sqrt[5]{2} + \cdots, \]

in which the absolute value of the general term $a_n = \sqrt[n+1]{2}$. Now it is true that $a_{n+1} < a_n$, but $\lim a_n = 1$, so that the last condition of the theorem is violated, rendering it inapplicable. Theorem 12.2 shows the series to be divergent.

The form of argument that was used to show $0 < S_{2r} < a_1$ also shows that

\[ 0 < \sum_{r=2m+1}^{\infty} (-1)^{r+1} a_r < a_{2m+1} \]

and, by a slight modification, that

\[ -a_{2m} < \sum_{r=2m}^{\infty} (-1)^{r+1} a_r < 0. \]

As $R_{2m} = \sum_{r=2m+1}^{\infty} (-1)^{r+1} a_r$ is the remainder after an even number $2m$ of terms, and $R_{2m-1} = \sum_{r=2m}^{\infty} (-1)^{r+1} a_r$ is the remainder after an odd number $2m - 1$ of terms, it follows that if $N$ is either even or odd, then

\[ 0 < |R_N| < a_{N+1}. \]

Expressed in words this asserts that when an alternating series is terminated after the $N$th term, the absolute value of the error involved is less than the magnitude $a_{N+1}$ of the next term.

Corollary 12.7 ($R_N$ for alternating series) If the alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges, and $R_N$ is the remainder after $N$ terms, then

\[ 0 < |R_N| < a_{N+1}. \]

Using the convergent alternating series in Example 12.8 (a) for purposes of illustration we see that $a_n = 1/2^n$, and so the remainder $R_N$ must be such that

\[ 0 < |R_N| < 1/2^{N+1}. \]

For example, termination of the summation of this series after five terms would result in an error whose absolute magnitude is less than 1/64.
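
This estimate can be confirmed exactly for the series with $a_n = 1/2^n$. The sketch below assumes the series starts with a positive term, $\tfrac{1}{2} - \tfrac{1}{4} + \tfrac{1}{8} - \cdots$, so that it is a geometric progression with sum $\tfrac{1}{3}$; the names `partial_sum` and `S` are chosen here for illustration only.

```python
from fractions import Fraction

def partial_sum(N):
    """Sum of the first N terms of 1/2 - 1/4 + 1/8 - ... in exact arithmetic."""
    return sum(Fraction((-1) ** (n + 1), 2 ** n) for n in range(1, N + 1))

# Sum of the geometric progression with first term 1/2 and common ratio -1/2.
S = Fraction(1, 3)

for N in range(1, 8):
    remainder = S - partial_sum(N)
    next_term = Fraction(1, 2 ** (N + 1))
    # Corollary 12.7: 0 < |R_N| < a_{N+1}
    print(N, remainder, abs(remainder) < next_term)
```

After five terms the partial sum is $11/32$, so the error is $1/96$ in magnitude, which is indeed less than $1/64$.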

A calculation involving the summation of a finite number of terms is 
often facilitated by grouping and interchanging their order. Although these 
operations are legitimate when the number of terms involved is finite, we 
must question their validity when dealing with an infinite number of terms. 
Later we shall show that the grouping of terms is permissible for any conver- 
gent series, but that rearrangement of terms is only permissible in a series 
when it is absolutely convergent, for only then does this operation leave the 
sum unaltered. 

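An example will help here to indicate the dangers of manipulating a series without first questioning the legitimacy of the operations to be performed upon it. Consider the alternating series

\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots, \]

which is seen to be convergent by virtue of our last theorem, and denote its sum by $S$. Then we have

\[ S = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \]

or, on rearranging the terms,

\[ S = \left( 1 - \frac{1}{2} \right) - \frac{1}{4} + \left( \frac{1}{3} - \frac{1}{6} \right) - \frac{1}{8} + \cdots = \frac{1}{2} \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \right) = \frac{1}{2} S. \]

This can only be true if $S = 0$, but clearly this is impossible because Corollary 12.7 above shows that the error in the summation after only one term is less than $\tfrac{1}{2}$ and therefore $S$ is certainly positive with $\tfrac{1}{2} < S < 1$.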

What has gone wrong? The answer is that in a sense we are ‘robbing Peter to pay Paul’. This occurs because both the series $\sum 1/(2n+1)$ and the series $\sum 1/2n$, from which are derived the positive and negative terms in our series, are divergent, and we have so rearranged the terms that they are weighted in favour of the negative ones. Other rearrangements could in fact be made to yield any sum that was desired. In other words, we are working with a series that is only conditionally convergent, and not absolutely convergent. It would seem from this that perhaps if a series $\sum a_n$ is absolutely convergent, then its terms should be capable of rearrangement and grouping without altering the sum. Let us prove the truth of this conjecture, but first we prove the simpler result that the grouping or bracketing of the terms of a convergent series leaves its sum unaltered.
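
The effect of weighting a rearrangement in favour of the negative terms can be seen numerically. The sketch below takes, as an assumption for illustration, the rearrangement of $1 - \tfrac{1}{2} + \tfrac{1}{3} - \cdots$ in which each positive term is followed by two negative ones, and it uses the known value $\log 2$ for the sum of the series in its original order; the function names are invented here.

```python
from math import log

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its original order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(blocks):
    """Partial sum of the rearrangement 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...,
    taking one positive term followed by two negative terms per block."""
    return sum(1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
               for k in range(1, blocks + 1))

original = alternating_harmonic(200_000)
shuffled = rearranged(100_000)
print(original, log(2))       # original order: close to log 2
print(shuffled, log(2) / 2)   # rearranged order: close to half of log 2
```

Both sums use exactly the same terms, yet the rearranged series settles on half the original value, which is the behaviour to be expected of a series that is only conditionally convergent.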

Suppose that $\sum a_n$ is a convergent series with sum $S$. Take as representative of the possible groupings of its terms the series derived from $\sum a_n$ by the insertion of parentheses (brackets) as indicated below:

\[ (a_1 + a_2) + (a_3 + a_4 + a_5) + a_6 + (a_7 + a_8) + \cdots. \]

Now denote the bracketed terms by $b_1, b_2, \ldots$, where $b_1 = a_1 + a_2$, $b_2 = a_3 + a_4 + a_5, \ldots$, so that we have associated a new series $\sum b_n$ with the original series $\sum a_n$. If the $n$th partial sums of $\sum a_n$ and $\sum b_n$ are $S_n$ and $S'_n$, respectively, then the partial sums $S'_1, S'_2, S'_3, S'_4, \ldots$ of $\sum b_n$ are, in reality, the partial sums $S_2, S_5, S_6, S_8, \ldots$ of $\sum a_n$. As $\sum a_n$ is convergent to $S$ by hypothesis, any subsequence of its partial sums $\{S_n\}$ must also converge to $S$. In particular this applies to the sequence $S_2, S_5, S_6, S_8, \ldots$, derived by the inclusion of parentheses. Hence $\sum b_n$ is also convergent to the sum $S$, which proves our result.

We now examine the effect of rearranging the terms of a series. Let $\sum a_n$ be absolutely convergent so that $\sum |a_n|$ must be convergent, and let $\sum b_n$ be a rearrangement of $\sum a_n$. Then, as the terms of $\sum |b_n|$ are in one-to-one correspondence with those of $\sum |a_n|$, it is clear that $\sum |b_n| = \sum |a_n|$, from which we deduce that $\sum b_n$ is also absolutely convergent.

Next we must show that $\sum a_n$ and $\sum b_n$ have the same sum. If $S_n$ is the $n$th partial sum of $\sum a_n$ which has the sum $S$, then by taking $n$ sufficiently large we may make $|S_n - S|$ as small as we wish; say less than an arbitrarily small positive number $\varepsilon$. Now let $S'_m$ be the $m$th partial sum of $\sum b_n$. Then, as $S_n$ contains the first $n$ terms of $\sum a_n$, with their suffixes in sequential order, by taking $m$ large enough we can obviously make $S'_m$ contain all the terms of $S_n$ together with $m - n$ additional terms $a_p, a_q, \ldots, a_r$, where $n < p < q < \cdots < r$.

Hence we may write

\[ S'_m = S_n + a_p + a_q + \cdots + a_r, \]

whence

\[ S'_m - S = S_n - S + a_p + a_q + \cdots + a_r. \]

Taking absolute values gives

\[ |S'_m - S| \le |S_n - S| + |a_p| + |a_q| + \cdots + |a_r|. \]

Now, $n$ was chosen such that $|S_n - S| < \varepsilon$, so that

\[ |S'_m - S| < \varepsilon + |a_p| + |a_q| + \cdots + |a_r|. \]

However, the remaining terms on the right-hand side of this inequality all occur after $a_n$ in the series $\sum a_n$. As $\sum |a_n|$ is convergent, $n$ may also be taken large enough for the sum of the absolute values of all the terms after $a_n$ to be less than $\varepsilon$, so that their total contribution cannot exceed $\varepsilon$, and thus

\[ |S'_m - S| < 2\varepsilon. \]

This shows that the $m$th partial sum of $\sum b_n$ converges to the sum $S$, so that rearrangement of the terms of an absolutely convergent series is permissible and does not affect its sum.


THEOREM 12.8 (grouping and rearrangement of series) If the series $\sum_{n=1}^{\infty} a_n$ is convergent, then parentheses may be inserted into the series without affecting its sum. If, in addition, the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent, then its terms may be rearranged without altering its sum.


Example 12.9

(a) Consider the series

\[ \sum_{m=1}^{\infty} \frac{1}{m(m+1)}, \]

which is easily seen to be absolutely convergent by use of the comparison test with $b_m = 1/m^2$. As absolute convergence obviously implies convergence, the first part of Theorem 12.8 asserts that we may group terms by inserting parentheses as we wish. So, using the identity

\[ \frac{1}{m(m+1)} = \frac{1}{m} - \frac{1}{m+1}, \]

we find for the $n$th partial sum $S_n$ the expression

\[ S_n = \sum_{m=1}^{n} \left( \frac{1}{m} - \frac{1}{m+1} \right). \]

Now successive terms in this summation cancel, or telescope as the process is sometimes called, leaving only the first and the last. This is best seen by writing out the expression for $S_n$ in full as follows:

\[ S_n = \left( 1 - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \cdots + \left( \frac{1}{n-1} - \frac{1}{n} \right) + \left( \frac{1}{n} - \frac{1}{n+1} \right) = 1 - \frac{1}{n+1}. \]

Hence, if the sum of the series is $S$, we have

\[ S = \lim_{n\to\infty} S_n = \lim_{n\to\infty} \left[ 1 - \frac{1}{n+1} \right] = 1. \]

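
The telescoping of the partial sums in part (a) can be checked in exact arithmetic. This is a small sketch only; the function name `partial_sum` is an assumption made for the illustration.

```python
from fractions import Fraction

def partial_sum(n):
    """Exact n-th partial sum of the series with general term 1 / (m (m + 1))."""
    return sum(Fraction(1, m * (m + 1)) for m in range(1, n + 1))

for n in (1, 2, 5, 10, 100):
    # Each partial sum should equal 1 - 1/(n + 1), which tends to 1.
    print(n, partial_sum(n), 1 - Fraction(1, n + 1))
```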
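(b) Consider the series

\[ \frac{1}{2} + \frac{1}{3} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots, \]

which can be shown to be absolutely convergent by an extension of the $n$th root test. (See Problem 12.14.) The second part of Theorem 12.8 is applicable, so that we may rearrange terms and, denoting the sum by $S$, we obtain

\[ S = \left( \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots \right) + \left( \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots \right) = 1 + \frac{1}{2} = \frac{3}{2}. \]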

The use of parentheses in a divergent series can sometimes produce a 
convergent series and, conversely, when attempting to alter the form of 
a convergent series a divergent series may sometimes be produced 
inadvertently. 

For instance, taking Example 12-9 (b), we could have written 


5: Ι Σ(2: - ἘΞ Ὁ} 
n 


2) ney μς ee 


“nti on+2 
=2 n ae 
τ 1 O° 1 
aay ττ Sy —-2. 
2 n 2 n 


which we know to be an incorrect result. The error is, of course, contained in 
the first line in which we attempt to equate an absolutely convergent series 
with the difference between two divergent series. 


12.2 Power series


Up to now we have been concerned entirely with series that did not contain the variable $x$. A more general type of series called a power series in $(x - x_0)$ has the general form

\[ \sum_{n=0}^{\infty} a_n (x - x_0)^n = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + \cdots, \tag{12.1} \]

in which the coefficients $a_0, a_1, \ldots, a_n, \ldots$ are constants. When $x$ is assigned some fixed value $\xi$, say, the power series Eqn (12.1) reduces to an ordinary series of the kind discussed in the previous section, and so may be tested for convergence by any appropriate test mentioned there.

For simplicity we now apply the ratio test to series Eqn (12.1), allowing $x$ to remain a free variable, in order to try to deduce the interval for $x$ in which the series is absolutely convergent. If $\alpha_n(x)$ is the absolute value of the ratio of the $(n + 1)$th term to the $n$th term as a function of $x$, we have

\[ \alpha_n(x) = \left| \frac{a_{n+1} (x - x_0)^{n+1}}{a_n (x - x_0)^n} \right| = \left| \frac{a_{n+1}}{a_n} \right| |x - x_0|. \]

Now for any specific value of $x$, the ratio test asserts that the series will be convergent if $\lim_{n\to\infty} \alpha_n(x) < 1$, whence we must require

\[ \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| |x - x_0| < 1. \]

Thus the largest value $r$, say, of $|x - x_0|$ for which this is true is given by

\[ r = \lim_{n\to\infty} \left| \frac{a_n}{a_{n+1}} \right|, \tag{12.2} \]

provided that this limit exists.
The inequality

\[ |x - x_0| < r \tag{12.3} \]

thus defines the $x$-interval $(x_0 - r, x_0 + r)$ within which the power series Eqn (12.1) is absolutely convergent. For $x$ outside this interval the ratio test shows that the power series must be divergent. (See Fig. 12.2.) The interval itself is called the interval of convergence of the power series, and the number $r$ is called the radius of convergence of the power series. The interval of convergence has been deliberately displayed in the form of an open interval because the ratio test can offer no information about the behaviour of the series at the end points. In fact the power series may either be convergent or divergent at these points.


[Figure: the $x$-axis, with the series absolutely convergent for $x_0 - r < x < x_0 + r$ and divergent to the left of $x_0 - r$ and to the right of $x_0 + r$.]

Fig. 12.2 Interval of convergence.

The radius of convergence of a power series can also be deduced from the $n$th root test, when it is easily seen that

\[ r = \lim_{n\to\infty} \frac{1}{\sqrt[n]{|a_n|}}, \tag{12.4} \]

provided that this limit exists.
DEFINITION 12.2 (radius of convergence of power series) The radius of convergence $r$ of the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ is defined either as:

\[ r = \lim_{n\to\infty} \left| \frac{a_n}{a_{n+1}} \right|, \]

or

\[ r = \lim_{n\to\infty} \frac{1}{\sqrt[n]{|a_n|}}, \]

provided that these limits exist.


Example 12.10

(a) Let us show that the series for the exponential function is absolutely convergent for all real $x$. We have

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots, \]

in which the general coefficient $a_n = 1/n!$. Now

\[ \left| \frac{a_n}{a_{n+1}} \right| = \frac{(n+1)!}{n!} = n + 1, \]

so that

\[ r = \lim_{n\to\infty} (n + 1) \to \infty. \]

We have thus proved that the power series for $e^x$ is absolutely convergent for all real $x$. This was an example of a power series with an infinite radius of convergence.

(b) Consider the series

\[ x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \]

which reduces to the illustrative example following Corollary 12.7, when $x = 1$. We shall see later that this is the power series expansion of $\log (1 + x)$. Then, again applying limit (12.2), we have $a_n = (-1)^{n+1}/n$, and so

\[ \left| \frac{a_n}{a_{n+1}} \right| = \frac{n+1}{n}. \]

Thus we have

\[ r = \lim_{n\to\infty} \left( 1 + \frac{1}{n} \right) = 1. \]

Hence, the series is absolutely convergent for $|x| < 1$. As we already know the series is convergent for $x = 1$, and divergent for $x = -1$, for then it becomes the harmonic series with the signs of all terms reversed, we have proved that the power series for $\log (1 + x)$ is convergent for $-1 < x \le 1$, the convergence being absolute for $|x| < 1$. This was an example of a power series with radius of convergence unity.

(c) Consider the series

\[ 1 + x + (2x)^2 + (3x)^3 + \cdots + (nx)^n + \cdots, \]

then $a_n = n^n$ so that

\[ \frac{1}{\sqrt[n]{|a_n|}} = \frac{1}{n}. \]

Hence, from Eqn (12.4),

\[ r = \lim_{n\to\infty} \frac{1}{n} = 0. \]

This series has zero radius of convergence and so is absolutely convergent only when $x = 0$. That is to say this power series has a finite sum, and so is convergent, only at the one point $x = 0$ on the real line.
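
The three radii of convergence just found can be estimated numerically from Definition 12.2. The sketch below is illustrative only: the function names and the choice $n = 100$ are assumptions, and the root estimate is evaluated through logarithms of the exact numerator and denominator so that very large coefficients such as $n^n$ do not overflow.

```python
from fractions import Fraction
from math import exp, factorial, log

def ratio_estimate(coef, n):
    """|a_n / a_{n+1}|, the quantity whose limit is taken in Eqn (12.2)."""
    return abs(coef(n) / coef(n + 1))

def root_estimate(coef, n):
    """1 / |a_n|^(1/n), the quantity whose limit is taken in Eqn (12.4)."""
    c = abs(coef(n))
    return exp(-(log(c.numerator) - log(c.denominator)) / n)

exp_coef = lambda n: Fraction(1, factorial(n))        # series for e^x
log_coef = lambda n: Fraction((-1) ** (n + 1), n)     # series for log(1 + x)
pow_coef = lambda n: Fraction(n ** n)                 # series with terms (nx)^n

n = 100
print(ratio_estimate(exp_coef, n))   # grows without bound as n increases
print(ratio_estimate(log_coef, n))   # approaches 1
print(root_estimate(pow_coef, n))    # approaches 0
```

For a fixed $n$ these are only estimates of the limits; the ratio and root forms converge at different rates, as may be seen by also evaluating `root_estimate(exp_coef, n)`.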

As a power series is yet another example of the representation of a function of the variable $x$, it is reasonable to enquire how we may differentiate and integrate functions that are so defined. For simplicity we will take $x_0 = 0$, and work with the power series about the origin

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n. \tag{12.5} \]

This is no restriction because Eqn (12.1) can be brought into this form by shifting the origin by means of the change of variable $t = x - x_0$. We will assume that Eqn (12.5) has a radius of convergence $r > 0$.

Intuition suggests that the derivative of $f(x)$ could be obtained by differentiating the right-hand side of Eqn (12.5) term by term and, similarly, that $\int_a^b f(t)\,dt$ could be obtained by term by term integration. However, extreme caution must be exercised in such matters for we have already seen that what is legitimate for the sum of a finite number of terms is not necessarily legitimate for an infinite series. Furthermore, we are now dealing with an infinite series of functions, and not just an ordinary series. In fact we shall show that termwise differentiation and integration of a power series is always permissible when $x$ lies within the interval of convergence $-r < x < r$ of Eqn (12.5).
The justification of termwise differentiation that we now offer is perhaps the most subtle and difficult proof to be found in this book. It has been included because differentiation of functions defined by a power series is fundamental to many branches of mathematics. In fact we have already employed termwise differentiation when deriving the series representation for $e^x$ in Chapter 6, and we shall use it again when discussing differential equations. The proof of this result also serves to indicate how any study of the subject beyond this level must, of necessity, involve the notion of uniform convergence. This aspect of the proof is not emphasized here, since it is beyond the scope of a first account.

Our object will be to prove that the function

\[ F(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \tag{12.6} \]

is the derivative of the function $f(x)$ of Eqn (12.5), that is to say that $f'(x) = F(x)$.

First notice that Eqns (12.5) and (12.6) have the same radius of convergence. This follows because, by hypothesis,

\[ \lim_{n\to\infty} \left| \frac{a_n}{a_{n+1}} \right| = r, \]

and the ratio of the $m$th to the $(m + 1)$th coefficient of Eqn (12.6) is $m a_m/(m + 1) a_{m+1}$, whence

\[ \lim_{m\to\infty} \left| \frac{m a_m}{(m + 1) a_{m+1}} \right| = \lim_{m\to\infty} \left( \frac{m}{m + 1} \right) \cdot \lim_{m\to\infty} \left| \frac{a_m}{a_{m+1}} \right| = r. \]



Next, if $x$ and $x + h$ are points in the interval of convergence, form the difference quotient

\[ \frac{f(x + h) - f(x)}{h} = \sum_{n=0}^{\infty} a_n \left( \frac{(x + h)^n - x^n}{h} \right). \tag{12.7} \]

The grouping of terms on the right-hand side is permissible because of the absolute convergence of the power series for $f(x)$ in $-r < x < r$.

Then, applying the mean value theorem for derivatives (Theorem 5.12) to the general term on the right-hand side of Eqn (12.7), we have

\[ (x + h)^n - x^n = h n \xi_n^{\,n-1}, \]

where $x < \xi_n < x + h$ for $n = 1, 2, \ldots$. Thus we arrive at the result

\[ \frac{f(x + h) - f(x)}{h} = \sum_{n=1}^{\infty} n a_n \xi_n^{\,n-1}. \tag{12.8} \]

Then, as Eqns (12.5) and (12.6) have the same radius of convergence, we may consider the difference between Eqns (12.6) and (12.8), again using the fact that absolute convergence permits rearrangement of terms to give

\[ F(x) - \sum_{n=1}^{\infty} n a_n \xi_n^{\,n-1} = \sum_{n=2}^{\infty} n a_n \left( x^{n-1} - \xi_n^{\,n-1} \right), \]

or

\[ F(x) - \frac{f(x + h) - f(x)}{h} = \sum_{n=2}^{\infty} n a_n \left( x^{n-1} - \xi_n^{\,n-1} \right). \]

Let us again use the mean value theorem for derivatives to obtain the result

\[ x^{n-1} - \xi_n^{\,n-1} = (n - 1)(x - \xi_n)\, \eta_n^{\,n-2}, \]

where $x < \eta_n < \xi_n$. Then, as $|x - \xi_n| < |h|$, we have

\[ \left| F(x) - \frac{f(x + h) - f(x)}{h} \right| \le |h| \sum_{n=2}^{\infty} n(n - 1)\, |a_n|\, |\eta_n|^{n-2}. \tag{12.9} \]

Now the form of argument used to prove that the power series Eqn (12.6) has radius of convergence $r$, also proves that the series on the right-hand side of this inequality has radius of convergence $r$. So, allowing $h$ to tend to zero, as the sum of the series is finite the right-hand side of Eqn (12.9) also tends to zero whilst the difference quotient approaches $f'(x)$. Hence we have proved our result. The difficult part of this proof was in showing that the right-hand side of Eqn (12.9) can be made arbitrarily small independently of $x$ in the interval of convergence. This is the property of uniform convergence mentioned in Chapter 3.

As differentiability implies continuity we have, as an incidental result, 
proved that a power series is continuous within its interval of convergence. 
A more direct proof is indicated in Problem 12-19 at the end of the chapter. 

The termwise integrability of power series is easier to prove. Denote by $H(x)$ the series

\[ H(x) = \sum_{n=0}^{\infty} \frac{a_n}{n + 1}\, x^{n+1}, \tag{12.10} \]

which is obtained by termwise integration of Eqn (12.5). That is, we must prove that

\[ H(x) = \int_0^x f(t)\,dt. \]

Now the ratio of the $n$th to the $(n + 1)$th coefficients of Eqn (12.10) is $(n + 1) a_{n-1}/n a_n$, whence

\[ \lim_{n\to\infty} \left| \frac{(n + 1)\, a_{n-1}}{n\, a_n} \right| = \lim_{n\to\infty} \left( \frac{n + 1}{n} \right) \cdot \lim_{n\to\infty} \left| \frac{a_{n-1}}{a_n} \right| = r. \]

This shows that the power series Eqn (12.10) also has radius of convergence $r$. We have just established that a power series is differentiable for $x$ within its interval of convergence, so that $H'(x) = f(x)$ for $-r < x < r$. Thus by the fundamental theorem of calculus

\[ \int_0^x f(t)\,dt = H(x) - H(0) = H(x), \]

which was to be proved. Let us collect together these results into the form of a theorem.


THEOREM 12.9 (differentiation and integration of power series) Let the function $f(x)$ be defined by the power series

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \]

with radius of convergence $r > 0$. Then, within the common interval of convergence $-r < x < r$,

(a) $f(x)$ is a continuous function;

(b) $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$;

(c) $\int_0^x f(t)\,dt = \sum_{n=0}^{\infty} \dfrac{a_n}{n + 1}\, x^{n+1}$.

Example 12.11 Find the radius and interval of convergence of

\[ f(x) = \sum_{n=1}^{\infty} \frac{x^n}{n(n + 1)}. \]

Deduce $f'(x)$ and find its interval of convergence.

Solution The $n$th coefficient $a_n$ of the power series for $f(x)$ is $a_n = 1/n(n + 1)$, and so the radius of convergence $r$ is given by

\[ r = \lim_{n\to\infty} \left| \frac{a_n}{a_{n+1}} \right| = \lim_{n\to\infty} \frac{n + 2}{n} = 1. \]

To specify the complete interval of convergence it remains to examine the behaviour of the power series at the end points of the interval $-1 < x < 1$.

The series may be seen to be convergent at $x = 1$ by using the comparison test with $b_n = 1/n^2$. When $x = -1$ the series becomes an alternating series and is seen to be convergent by Theorem 12.7. Thus the complete interval of convergence for $f(x)$ is $-1 \le x \le 1$.

Under the conditions of Theorem 12.9 (b) we may differentiate the power series for $f(x)$ term by term within $-1 < x < 1$, so that

\[ f'(x) = \sum_{n=1}^{\infty} \frac{x^{n-1}}{n + 1}. \]

To specify the complete interval of convergence for this new series which, by Theorem 12.9 (b), is certainly convergent in $-1 < x < 1$, we must again examine the end points of the interval $-1 < x < 1$. The series for $f'(x)$ becomes an alternating series when $x = -1$, and is convergent by Theorem 12.7. At $x = 1$ it becomes the harmonic series with its first term omitted, and so is divergent. The complete interval of convergence for $f'(x)$ is thus $-1 \le x < 1$. The effect of termwise differentiation has been to produce divergence of the differentiated series at the right-hand end point of an interval of convergence at which $f(x)$ is convergent.
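
Termwise differentiation in Example 12.11 can be checked numerically at an interior point such as $x = 0.5$. The sketch below is an illustration under stated assumptions: the series are truncated after 200 terms, the derivative of $f$ is approximated by a central difference with step $h = 10^{-5}$, and the function names are invented here.

```python
def f(x, terms=200):
    """Partial sum of the series for f(x), with general term x^n / (n (n + 1))."""
    return sum(x ** n / (n * (n + 1)) for n in range(1, terms + 1))

def f_prime(x, terms=200):
    """Partial sum of the termwise-differentiated series, term x^(n-1) / (n + 1)."""
    return sum(x ** (n - 1) / (n + 1) for n in range(1, terms + 1))

def central_difference(g, x, h=1e-5):
    """Numerical derivative (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2 * h)

x = 0.5
print(central_difference(f, x))   # slope of the summed series
print(f_prime(x))                 # sum of the differentiated series

# At the end point x = 1 the differentiated series diverges: its partial
# sums keep growing as more terms are taken.
print(f_prime(1.0, terms=1000), f_prime(1.0, terms=100000))
```

The two values at $x = 0.5$ agree to within the accuracy of the difference quotient, whereas at $x = 1$ the partial sums of the differentiated series grow without settling down, in agreement with the interval $-1 \le x < 1$ found above.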


Example 12.12 Find the power series representation of $\arctan x$ by considering the integral

\[ \arctan x = \int_0^x \frac{dt}{1 + t^2}. \]

Deduce a series expansion for $\tfrac{1}{4}\pi$.

Solution An application of the Binomial Theorem to the function $(1 + a)^{-1}$ gives the result

\[ \frac{1}{1 + a} = 1 - a + a^2 - a^3 + a^4 - \cdots, \]

for $-1 < a < 1$. Setting $a = t^2$ we arrive at the power series representation of $(1 + t^2)^{-1}$,

\[ \frac{1}{1 + t^2} = 1 - t^2 + t^4 - t^6 + t^8 - \cdots. \tag{A} \]

The conditions of Theorem 12.9 (c) apply, and we may integrate this power series term by term to obtain

\[ \arctan x = \int_0^x \frac{dt}{1 + t^2} = \int_0^x \left( 1 - t^2 + t^4 - t^6 + \cdots \right) dt \]

or,

\[ \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots. \tag{B} \]

This is the desired power series for $\arctan x$ and by the conditions of Theorem 12.9 (c) it is certainly convergent within the interval $-1 < x < 1$, which is the interval of convergence of the original power series Eqn (A).

At each of the end points $x = \pm 1$ of this interval, the power series Eqn (B) becomes an alternating series which is seen to be convergent by Theorem 12.7. Hence the interval of convergence of the integrated series Eqn (B) is $-1 \le x \le 1$. Using the fact that $\arctan 1 = \tfrac{1}{4}\pi$, we find

\[ \tfrac{1}{4}\pi = 1 - \tfrac{1}{3} + \tfrac{1}{5} - \tfrac{1}{7} + \cdots. \]

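
The series (B) and the resulting expansion of $\tfrac{1}{4}\pi$ are easily summed numerically. In this sketch the function name `arctan_series` and the numbers of terms are assumptions chosen for illustration; the library function `math.atan` is used only as a reference value.

```python
from math import atan, pi

def arctan_series(x, terms):
    """Sum of the first `terms` terms of x - x^3/3 + x^5/5 - x^7/7 + ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

# Inside the interval of convergence the series converges rapidly.
print(arctan_series(0.5, 30), atan(0.5))

# At the end point x = 1 it converges slowly; by Corollary 12.7 the error
# after N terms is less than the next term, 1 / (2N + 1).
N = 1000
estimate = arctan_series(1.0, N)
print(4 * estimate, pi, abs(estimate - pi / 4) < 1 / (2 * N + 1))
```

The slow convergence at $x = 1$ shows why this expansion, although elegant, is a poor means of computing $\pi$.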


12.3 Taylor’s theorem


So far we have discussed the convergence properties of a function $f(x)$ which is defined by a given power series. Let us now reverse this idea and enquire how, when given a specific function $f(x)$, its power series representation may be obtained. Otherwise expressed, we are asking how the coefficients $a_n$ in the power series

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \tag{12.11} \]

may be determined when $f(x)$ is some given function.

First, by setting $x = 0$, we discover that $f(0) = a_0$. Then, on the assumption that the power series Eqn (12.11) has a radius of convergence $r > 0$, differentiate it term by term to obtain

\[ f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}, \tag{12.12} \]

for $-r < x < r$.
Again setting $x = 0$ shows that $f'(0) = a_1$. Differentiating Eqn (12.12) again with respect to $x$ yields

\[ f''(x) = \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2}, \tag{12.13} \]

from which we conclude $f''(0) = 2!\,a_2$.
Proceeding systematically in this manner gives the general result

\[ f^{(n)}(x) = \sum_{m=n}^{\infty} m(m - 1) \cdots (m - n + 1)\, a_m x^{m-n}, \tag{12.14} \]

so that $f^{(n)}(0) = n!\,a_n$.
Thus the coefficients in power series Eqn (12.11) are determined by the formula

\[ a_n = \frac{f^{(n)}(0)}{n!} \tag{12.15} \]

for $n \ge 1$ and $a_0 = f(0)$.
Substituting these coefficients into Eqn (12.11) we finally arrive at the power series

\[ f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \cdots + \frac{x^n}{n!} f^{(n)}(0) + \cdots. \tag{12.16} \]

The expression on the right-hand side of this equation is known as the Maclaurin series for $f(x)$, and it presupposes that $f(x)$ is differentiable an infinite number of times. To justify the use of the equality sign in Eqn (12.16) it is, of course, necessary to test the series for convergence to verify that its radius of convergence $r > 0$, and to show that $|f(x) - S_n(x)| \to 0$ as $n \to \infty$, where $S_n(x)$ is the sum of the first $n$ terms of the Maclaurin series. We shall return to this matter later.

To transform Eqn (12.16) into a power series in $(x - x_0)$ we set $x = x_0 + h$ and let $f(x_0 + h) = \phi(h)$. Then $\phi'(h) = f'(x_0 + h)$, $\phi''(h) = f''(x_0 + h)$, $\ldots$, $\phi^{(n)}(h) = f^{(n)}(x_0 + h)$, $\ldots$. It thus follows that $\phi^{(n)}(0) = f^{(n)}(x_0)$ for $n \ge 1$ and $\phi(0) = f(x_0)$. The Maclaurin series for $\phi(h)$ is

\[ \phi(h) = \phi(0) + h \phi'(0) + \frac{h^2}{2!} \phi''(0) + \cdots + \frac{h^n}{n!} \phi^{(n)}(0) + \cdots \]

or, reverting to the function $f$,

\[ f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{(x - x_0)^2}{2!} f''(x_0) + \cdots + \frac{(x - x_0)^n}{n!} f^{(n)}(x_0) + \cdots. \tag{12.17} \]

Expressed in this form the expression on the right-hand side is called the Taylor series for $f(x)$ about the point $x = x_0$.


Example 12.13 Find the Maclaurin series for $\log (1 + x)$ and $\log (1 - x)$. Deduce the expansion for $\log [(1 + x)/(1 - x)]$.

Solution Setting $f(x) = \log (1 + x)$ we find

\[ f^{(n)}(x) = \frac{(-1)^{n+1} (n - 1)!}{(1 + x)^n}, \]

and so

\[ f^{(n)}(0) = (-1)^{n+1} (n - 1)! \]

for $n \ge 1$ and $f(0) = 0$. Combining this expression for $f^{(n)}(0)$ with Eqn (12.16) gives for the Maclaurin series for $\log (1 + x)$,

\[ \log (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots. \]

This has already been examined for convergence in Example 12.10 (b) and found to be absolutely convergent for $|x| < 1$ and convergent in the interval $-1 < x \le 1$.
In the case of the function $\log (1 - x)$ the same argument shows that

\[ f^{(n)}(0) = -(n - 1)! \]

for $n \ge 1$ and $f(0) = 0$, so that the Maclaurin series for $\log (1 - x)$ has the form

\[ \log (1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots. \]

This can readily be seen to have $-1 \le x < 1$ for its interval of convergence. Using the fact that $\log \{(1 + x)/(1 - x)\} = \log (1 + x) - \log (1 - x)$ gives the desired result

\[ \log \left( \frac{1 + x}{1 - x} \right) = 2 \left( x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \cdots \right) \]

for $-1 < x < 1$.

Strictly speaking, we are not yet entitled to use the equality sign between the function and its Maclaurin series, as we have not yet established the convergence of the $n$th partial sum of the series to the function it represents. We will do this later.
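
The last expansion provides a convenient way of computing logarithms, since any positive number can be written as $(1 + x)/(1 - x)$ with $-1 < x < 1$. As a hedged illustration, the sketch below takes $x = \tfrac{1}{3}$, for which $(1 + x)/(1 - x) = 2$; the function name and the number of terms are assumptions made here.

```python
from math import log

def log_ratio_series(x, terms):
    """Sum of the first `terms` terms of 2 (x + x^3/3 + x^5/5 + ...)."""
    return 2 * sum(x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

x = 1 / 3                     # (1 + x) / (1 - x) = 2
for terms in (1, 2, 5, 20):
    print(terms, log_ratio_series(x, terms))
print("log 2 =", log(2))
```

Only a handful of terms are needed here, in contrast with the series for $\log (1 + x)$ at $x = 1$, which also represents $\log 2$ but converges very slowly.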


Example 12.14 Use Taylor’s series to express the polynomial

\[ P(x) = x^4 + 3x^3 + x^2 + 2x + 1 \]

in terms of powers of $(x - 1)$.

Solution To utilize the Taylor series in Eqn (12.17) we must set $x_0 = 1$ and $f(x) = P(x)$. Then a simple calculation shows that

\[ P(1) = 8, \quad P'(1) = 17, \quad P''(1) = 32, \quad P'''(1) = 42, \quad P^{(4)}(1) = 24 \quad \text{and} \quad P^{(n)}(1) = 0 \ \text{for } n \ge 5. \]

Hence we arrive at the finite power series

\[ P(x) = 8 + (x - 1) \cdot 17 + \frac{(x - 1)^2}{2!} \cdot 32 + \frac{(x - 1)^3}{3!} \cdot 42 + \frac{(x - 1)^4}{4!} \cdot 24 \]

or

\[ P(x) = 8 + 17(x - 1) + 16(x - 1)^2 + 7(x - 1)^3 + (x - 1)^4. \]

The use of the equality sign is fully justified here since we are dealing with a finite power series.
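
The calculation of Example 12.14 can be mechanized for any polynomial. In the sketch below a polynomial is stored as a list of coefficients in ascending powers of $x$; this representation and the function names are assumptions made for illustration, and the coefficients of the powers of $(x - x_0)$ are obtained directly from Eqn (12.17) as $P^{(n)}(x_0)/n!$.

```python
from fractions import Fraction
from math import factorial

def evaluate(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients (Horner's rule)."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def derivative(coeffs):
    """Ascending coefficients of the derivative of the polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def taylor_coefficients(coeffs, x0):
    """Coefficients P^(n)(x0) / n! of the powers of (x - x0), n = 0, 1, 2, ..."""
    result = []
    current = list(coeffs)
    n = 0
    while current:
        result.append(Fraction(evaluate(current, x0), factorial(n)))
        current = derivative(current)
        n += 1
    return result

# P(x) = x^4 + 3x^3 + x^2 + 2x + 1, stored in ascending powers of x.
P = [1, 2, 1, 3, 1]
shifted = taylor_coefficients(P, 1)
print([int(c) for c in shifted])   # coefficients of 1, (x-1), ..., (x-1)^4
```

Because the derivatives of a polynomial of degree four vanish from the fifth onwards, the loop terminates of its own accord and the result is the finite power series found above.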

It can happen that the derivatives of a function f(x) are not defined at 
x = 0 so that its formal Maclaurin series expansion cannot be obtained. In 
this case, provided the function is infinitely differentiable at the point x = xo, 
then f(x) may be expanded in a Taylor series about that point. Such a case is 
discussed by the following simple example. 


Example 12.15 Derive the nth derivative $f^{(n)}(x)$ of the function $f(x) = x \log x$,
and show that $f^{(n)}(0)$ is not defined. Deduce the Taylor series
expansion of $f(x)$ about the point $x = 1$.


Solution Direct differentiation shows that $f'(x) = 1 + \log x$, $f''(x) = 1/x$,
$f'''(x) = -1/x^2$, $f^{(4)}(x) = 2!/x^3$, $f^{(5)}(x) = -3!/x^4, \ldots$, and in general

$$f^{(n)}(x) = \frac{(-1)^n (n-2)!}{x^{n-1}}$$

for $n \ge 2$. Hence it is clear that $f^{(n)}(0)$ is not defined for any $n$. However, the
numbers $f^{(n)}(1)$ are defined for all $n$ and $f^{(n)}(1) = (-1)^n (n-2)!$ for $n \ge 2$
and $f(1) = 0$, $f'(1) = 1$. The Taylor series for $x \log x$ can now be obtained
from Eqn (12.17) by making the identification $x_0 = 1$ and then using the
derivatives $f^{(n)}(1)$ which have just been computed. We find

$$x \log x = (x-1) + \frac{(x-1)^2}{1\cdot 2} - \frac{(x-1)^3}{2\cdot 3} + \frac{(x-1)^4}{3\cdot 4} - \frac{(x-1)^5}{4\cdot 5} + \cdots,$$

which is the desired result. Again, we have used the equality sign without first
showing that the nth partial sum of the Taylor series converges to $x \log x$ as
$n \to \infty$.

Regarding this as a power series in the variable $t = (x - 1)$ we find that
the coefficient $a_n$ of the power $t^n$ is $a_n = (-1)^n/n(n - 1)$, whence the radius
of convergence

$$r = \lim_{n\to\infty}\left|\frac{a_n}{a_{n+1}}\right| = \lim_{n\to\infty}\frac{n(n+1)}{(n-1)n} = 1.$$

The power series is thus absolutely convergent in the interval $-1 < t < 1$
or, equivalently, in $0 < x < 2$. The series is convergent when $x = 2$, because
then it becomes an alternating series. It is also convergent when $x = 0$ by
comparison with the series with the general term $b_n = 1/n^2$. In fact we can do
better than this when $x = 0$, for then we can actually sum the series. Aside
from the first term, which becomes $-1$, the sum of the remaining terms must
be $+1$ by virtue of Example 12.9 (a), showing that if the equality sign may be
believed, then

$$\lim_{x\to 0}\,(x \log x) = 0.$$

This is encouraging, because it is in agreement with the result which can be
obtained from Theorem 6.4 (b) by replacing $x$ by $1/x$. This would strongly
suggest that our series is in fact equal to $x \log x$ in the complete interval of
convergence $0 < x < 2$.
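
The behaviour of this series is easy to explore numerically. The short Python sketch below is an added illustration; the function name `xlogx_partial_sum` is hypothetical. It sums the Taylor series of $x \log x$ about $x = 1$ with coefficients $(-1)^n/n(n-1)$ and compares the result with the function inside the interval, and with the limiting value 0 at the end point $x = 0$.

```python
import math


def xlogx_partial_sum(x, terms):
    """Sum (x-1) + sum_{n=2}^{terms} (-1)^n (x-1)^n / (n(n-1))."""
    t = x - 1.0
    total = t
    for n in range(2, terms + 1):
        total += (-1) ** n * t ** n / (n * (n - 1))
    return total


# Inside the interval of convergence the partial sums approach x log x.
x = 1.5
print(xlogx_partial_sum(x, 60), x * math.log(x))

# At x = 0 the partial sum telescopes to -1/terms, which tends to 0.
print(xlogx_partial_sum(0.0, 1000))
```

At $x = 0$ the terms after the first form the telescoping sum of $1/n(n-1)$, so the partial sum with $N$ terms equals $-1/N$; convergence there is slow, in contrast to the geometric rate inside the interval.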

We have attempted to emphasize that although we have indicated how a
Maclaurin or Taylor series may be associated with a function $f(x)$ that is
infinitely differentiable, the general question of just exactly when the series
is equal to the function with which it is associated still remains open. To
indicate that an infinitely differentiable function need not be represented by
its Maclaurin series at more than a single point, despite the fact that the series is
convergent for all $x$, we examine the function (see Problem 6.10)

$$f(x) = \begin{cases} e^{-1/x^2} & \text{for } x \ne 0 \\ 0 & \text{for } x = 0. \end{cases}$$

This function is easily seen to be infinitely differentiable, and to be such
that $f^{(n)}(0) = 0$ for all $n$. The Maclaurin series for $f(x)$ is thus

$$f(x) = 0 + 0 + 0 + \cdots,$$

which is clearly convergent for all $x$, yet it is only equal to the function $f(x)$
at the single point $x = 0$. Such behaviour is quite exceptional, yet the fact
that it is associated with a seemingly simple function justifies the caution
with which we must approach the question of equality between a function
and its power series expansion.

On occasions, the computation of the nth derivative $f^{(n)}(x)$ is simplified
by employing Leibnitz’s theorem as we now illustrate.


Example 12.16 If $f(x) = \cos(k \arccos x)$, and $f^{(n)}(x)$ denotes the nth
derivative of $f(x)$, show that

$$(1 - x^2)f^{(n+2)}(x) - (2n + 1)xf^{(n+1)}(x) - (n^2 - k^2)f^{(n)}(x) = 0,$$

for $n = 0, 1, \ldots$, where $f^{(0)}(x) = f(x)$. Deduce the Maclaurin series for $f(x)$.


Solution As $f(x) = \cos(k \arccos x)$, it follows by differentiation that

$$f'(x) = \frac{k \sin(k \arccos x)}{\sqrt{1 - x^2}}$$

and

$$f''(x) = \frac{-k^2 \cos(k \arccos x)}{1 - x^2} + \frac{xk \sin(k \arccos x)}{(1 - x^2)^{3/2}}.$$

A little manipulation shows that $f(x)$ satisfies the differential equation

$$(1 - x^2)f''(x) - xf'(x) + k^2 f(x) = 0$$

or,

$$(1 - x^2)f^{(2)}(x) - xf^{(1)}(x) + k^2 f^{(0)}(x) = 0.$$


Now differentiating this equation $n$ times, and using the symbolic
differentiation operator $D$, gives

$$D^n[(1 - x^2)f^{(2)}(x) - xf^{(1)}(x) + k^2 f^{(0)}(x)] = 0$$

or,

$$D^n[(1 - x^2)f^{(2)}(x)] - D^n[xf^{(1)}(x)] + D^n[k^2 f^{(0)}(x)] = 0.$$

Whence, employing Leibnitz’s theorem (Theorem 5.16), this becomes

$$(1 - x^2)f^{(n+2)}(x) + n(-2x)f^{(n+1)}(x) + \frac{n(n-1)}{2!}(-2)f^{(n)}(x) - xf^{(n+1)}(x) - nf^{(n)}(x) + k^2 f^{(n)}(x) = 0,$$

showing that

$$(1 - x^2)f^{(n+2)}(x) - (2n + 1)xf^{(n+1)}(x) - (n^2 - k^2)f^{(n)}(x) = 0.$$


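This is a differential equation, but setting $x = 0$ it reduces to a recurrence
relation for $f^{(n)}(0)$:

$$f^{(n+2)}(0) = (n^2 - k^2)f^{(n)}(0) \quad \text{for } n = 0, 1, 2, \ldots.$$

As $f^{(0)}(0) = f(0) = \cos(k \arccos 0) = \cos(\tfrac12 k\pi)$ and $f^{(1)}(0) = f'(0) =
k \sin(k \arccos 0) = k \sin(\tfrac12 k\pi)$, we have

$$f^{(2)}(0) = -k^2 f^{(0)}(0) = -k^2 \cos\frac{k\pi}{2},$$
$$f^{(4)}(0) = (2^2 - k^2)f^{(2)}(0) = -k^2(2^2 - k^2)\cos\frac{k\pi}{2},$$
$$f^{(6)}(0) = (4^2 - k^2)f^{(4)}(0) = -k^2(2^2 - k^2)(4^2 - k^2)\cos\frac{k\pi}{2},$$

and

$$f^{(3)}(0) = (1^2 - k^2)f^{(1)}(0) = k(1^2 - k^2)\sin\frac{k\pi}{2},$$
$$f^{(5)}(0) = (3^2 - k^2)f^{(3)}(0) = k(1^2 - k^2)(3^2 - k^2)\sin\frac{k\pi}{2},$$
$$f^{(7)}(0) = (5^2 - k^2)f^{(5)}(0) = k(1^2 - k^2)(3^2 - k^2)(5^2 - k^2)\sin\frac{k\pi}{2},$$

and so on.
The general expressions are

$$f^{(2m-1)}(0) = k(1^2 - k^2)(3^2 - k^2)\cdots[(2m-3)^2 - k^2]\sin\frac{k\pi}{2},$$

$$f^{(2m)}(0) = -k^2(2^2 - k^2)(4^2 - k^2)\cdots[(2m-2)^2 - k^2]\cos\frac{k\pi}{2},$$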


from which we conclude that the Maclaurin series for $\cos(k \arccos x)$ has
the form

$$\cos(k \arccos x) = \cos\frac{k\pi}{2} + xk\sin\frac{k\pi}{2} - \frac{x^2}{2!}k^2\cos\frac{k\pi}{2} + \frac{x^3}{3!}k(1^2 - k^2)\sin\frac{k\pi}{2} - \frac{x^4}{4!}k^2(2^2 - k^2)\cos\frac{k\pi}{2} + \cdots.$$
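
The recurrence relation $f^{(n+2)}(0) = (n^2 - k^2)f^{(n)}(0)$ lends itself to direct computation. The Python sketch below is an added illustration with hypothetical function names: it builds the Maclaurin coefficients from the recurrence and compares the truncated series with $\cos(k \arccos x)$. For integer $k$ the series terminates; with $k = 3$ it reproduces $4x^3 - 3x$ up to rounding error.

```python
import math


def maclaurin_coeffs(k, order):
    """Coefficients f^(n)(0)/n! of cos(k arccos x) for n = 0..order."""
    derivs = [math.cos(k * math.pi / 2), k * math.sin(k * math.pi / 2)]
    for n in range(order - 1):
        # Recurrence f^(n+2)(0) = (n^2 - k^2) f^(n)(0).
        derivs.append((n * n - k * k) * derivs[n])
    return [d / math.factorial(n) for n, d in enumerate(derivs[: order + 1])]


def evaluate(coeffs, x):
    """Evaluate the truncated power series at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


# Integer k: the series terminates (coefficients of 4x^3 - 3x for k = 3).
print(maclaurin_coeffs(3, 5))

# Non-integer k: compare the truncated series with the function itself.
k, x = 0.5, 0.3
print(evaluate(maclaurin_coeffs(k, 40), x), math.cos(k * math.acos(x)))
```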
To make further progress it now becomes necessary for us to settle the
question of when a Maclaurin or Taylor series is really equal to the function
with which it is associated. Let the function $f(x)$ be infinitely differentiable
and have the Taylor series representation Eqn (12.17), and let $P_{n-1}(x)$ be
the sum of the first $n$ terms of the series terminating at the power $(x - x_0)^{n-1}$,
so that

$$P_{n-1}(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \cdots + \frac{(x - x_0)^{n-1}}{(n-1)!}f^{(n-1)}(x_0).$$

Then a necessary and sufficient condition that the Taylor series should
converge to $f(x)$ is obviously that

$$\lim_{n\to\infty}|f(x) - P_{n-1}(x)| = 0.$$

This suggests that to establish convergence we must examine the behaviour
of the remainder of the series after $n$ terms. To achieve this we now prove
Taylor’s theorem, one form of which is stated below.


THEOREM 12.10 (Taylor’s theorem with a remainder) Let $f(x)$ be a function
which is differentiable $n$ times in the interval $a \le x \le b$. Then there exists a
number, $\xi$, strictly between $a$ and $b$, such that

$$f(b) = f(a) + (b - a)f'(a) + \frac{(b - a)^2}{2!}f''(a) + \cdots + \frac{(b - a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{(b - a)^n}{n!}f^{(n)}(\xi).$$


Proof The proof of Taylor’s theorem we now offer will be based on Rolle’s
theorem. Let $k$ be defined such that

$$f(b) = f(a) + (b - a)f'(a) + \cdots + \frac{(b - a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{(b - a)^n}{n!}k,$$

and define the function $F(x)$ by the expression

$$F(x) = f(b) - f(x) - (b - x)f'(x) - \cdots - \frac{(b - x)^{n-1}}{(n-1)!}f^{(n-1)}(x) - \frac{(b - x)^n}{n!}k.$$

Then $F(b) = F(a) = 0$, and a simple calculation shows that

$$F'(x) = \frac{(b - x)^{n-1}}{(n-1)!}\left(k - f^{(n)}(x)\right).$$

Since, by hypothesis, $f(x)$ is differentiable in $a \le x \le b$, the function
$F(x)$ satisfies the conditions of Rolle’s theorem, which asserts that there must
be a number $\xi$, strictly between $a$ and $b$, for which $F'(\xi) = 0$. As $a < \xi < b$,
the factor $(b - \xi)^{n-1} \ne 0$, so that we must have $k = f^{(n)}(\xi)$. This completes
the proof of Taylor’s theorem.


If we identify $b$ with $x$ and $a$ with $x_0$, Taylor’s theorem with a remainder
takes the form

$$f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \cdots + \frac{(x - x_0)^{n-1}}{(n-1)!}f^{(n-1)}(x_0) + \frac{(x - x_0)^n}{n!}f^{(n)}(\xi), \tag{12.18}$$

where $x_0 < \xi < x$. For obvious reasons the last term of this expression is
called the remainder term and is usually denoted by $R_n(x)$. The form stated
here in which

$$R_n(x) = \frac{(x - x_0)^n}{n!}f^{(n)}(\xi), \tag{12.19}$$

with $x_0 < \xi < x$ is known as the Lagrange form of the remainder term.
When $x_0 = 0$ Eqn (12.18) reduces to Maclaurin’s theorem with a Lagrange
remainder,

$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots + \frac{x^{n-1}}{(n-1)!}f^{(n-1)}(0) + \frac{x^n}{n!}f^{(n)}(\xi), \tag{12.20}$$

where $0 < \xi < x$.


Example 12.17 Find the Lagrange remainders $R_n(x)$ after $n$ terms in the
Maclaurin series expansions of $e^x$, $\sin x$, and $\cos x$. By showing that in each
case $R_n(x) \to 0$ as $n \to \infty$, prove that these functions are equal to their
Maclaurin series expansions.


Solution If $f(x) = e^x$, it is easily shown that Eqn (12.20) takes the form

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{n-1}}{(n-1)!} + R_n(x),$$

where $R_n(x) = (x^n/n!)e^{\xi}$, and $0 < \xi < x$. Now $e^{\xi} < e^{|x|}$, and in connection
with an earlier equation we proved that

$$\left|\frac{x^n}{n!}\right| < \frac{|x|^{R-1}}{(R-1)!}\left(\tfrac12\right)^{n-R+1},$$

where $R$ is an integer greater than $2|x|$. Hence for any fixed $x$, $e^{|x|}$ is a finite
constant and $x^n/n! \to 0$ as $n \to \infty$. It follows from this that $R_n(x) \to 0$ as
$n \to \infty$. This provides an alternative verification of the results of Section 6.1.

If $f(x) = \sin x$, then the Maclaurin series with a Lagrange remainder
Eqn (12.20) becomes

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + \frac{x^n}{n!}\sin\left(\xi + \frac{n\pi}{2}\right),$$

where $0 < \xi < x$. The Lagrange remainder Eqn (12.19) is the last term

$$R_n(x) = \frac{x^n}{n!}\sin\left(\xi + \frac{n\pi}{2}\right).$$

Since $|\sin[\xi + (n\pi/2)]| \le 1$ we must have

$$|R_n(x)| \le \left|\frac{x^n}{n!}\right|,$$

showing that $R_n(x) \to 0$ as $n \to \infty$. This establishes the convergence of the
Maclaurin series to $\sin x$, and the argument for the cosine function is exactly
similar.
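
The bound $|R_n(x)| \le |x|^n/n!$ can be compared with the actual truncation error. The following Python sketch is an added illustration with hypothetical function names; it sums the Maclaurin terms of $\sin x$ up to the power $x^{n-1}$ and prints the true error beside the Lagrange bound.

```python
import math


def sin_maclaurin(x, n):
    """Sum of the Maclaurin terms of sin x up to the power x^(n-1)."""
    # The r-th derivative of sin at 0 is sin(r*pi/2).
    return sum(
        x ** r / math.factorial(r) * math.sin(r * math.pi / 2)
        for r in range(n)
    )


def lagrange_bound(x, n):
    """Upper bound |x|^n / n! on the Lagrange remainder R_n(x)."""
    return abs(x) ** n / math.factorial(n)


x = 2.0
for n in (2, 5, 10, 15):
    error = abs(math.sin(x) - sin_maclaurin(x, n))
    print(n, error, lagrange_bound(x, n))
```

For every $n$ the printed error lies below the bound, and both decrease rapidly once $n$ exceeds $|x|$.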


Example 12.18 Establish that the Maclaurin series of $\log(1 + x)$ converges
to the function in the interval $-1 < x \le 1$.


Solution The Maclaurin series with a remainder is (see Example 12.13)

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^n\frac{x^{n-1}}{n-1} + R_n(x),$$

where the Lagrange remainder is

$$R_n(x) = \frac{(-1)^{n+1}x^n}{n(1 + \xi)^n},$$

with $0 < \xi < x$. For the interval $0 \le x \le 1$, we must have $0 < \xi < 1$ so that
$1 + \xi > 1$, and hence $(1 + \xi)^n > 1$. Thus $|R_n(x)| < x^n/n \le 1/n \to 0$ as
$n \to \infty$, thereby proving convergence of the Maclaurin series to $\log(1 + x)$
for $0 \le x \le 1$.

We must proceed differently to prove convergence for the interval
$-1 < x < 0$. Set $y = -x$ and consider the interval $0 < y < 1$, in which
we may write

$$\log(1 + x) = \log(1 - y) = -\int_0^y \frac{dt}{1 - t}.$$

Using the identity

$$\frac{1}{1 - t} = 1 + t + t^2 + \cdots + t^{n-2} + \frac{t^{n-1}}{1 - t}$$

we have, after integration,

$$\log(1 - y) = -y - \frac{y^2}{2} - \frac{y^3}{3} - \cdots - \frac{y^{n-1}}{n-1} - \int_0^y \frac{t^{n-1}}{1 - t}\,dt.$$

Thus our remainder term is now expressed in the form of the integral

$$R_n(y) = -\int_0^y \frac{t^{n-1}}{1 - t}\,dt.$$

Now,

$$|R_n(y)| = \int_0^y \frac{t^{n-1}}{1 - t}\,dt < \frac{1}{1 - y}\int_0^y t^{n-1}\,dt = \frac{y^n}{n(1 - y)} < \frac{1}{n(1 - y)},$$

so that $|R_n(y)| \to 0$ as $n \to \infty$. This establishes convergence in the interval
$-1 < x < 0$. Taken together with the first result we have succeeded in
showing that the Maclaurin series of $\log(1 + x)$ converges to the function
itself in the interval $-1 < x \le 1$. This provides the justification for our
final result in Example 12.13.

When performing numerical calculations with Taylor series, the remainder
term provides information on the number of terms that must be retained in
order to attain any specified accuracy. Suppose, for example, we wished to
calculate $\sin 31°$ correct to five decimal places by means of Eqn (12.18).
Then first we would need to set $f(x) = \sin x$ to obtain

$$\sin x = \sin x_0 + (x - x_0)\cos x_0 - \frac{(x - x_0)^2}{2!}\sin x_0 - \cdots + \frac{(x - x_0)^{n-1}}{(n-1)!}\sin\left(x_0 + \frac{(n-1)\pi}{2}\right) + R_n(x),$$

where the remainder

$$R_n(x) = \frac{(x - x_0)^n}{n!}\sin\left(\xi + \frac{n\pi}{2}\right)$$

with $x_0 < \xi < x$.

As the arguments of trigonometric functions must be specified in radian
measure it is necessary to set $x$ equal to the radian equivalent of 31° and then
to choose a convenient value for $x_0$. We have 31° is equivalent to $\pi/6 + \pi/180$
radians, so that a convenient value for $x_0$ would be $x_0 = \pi/6$. This is, of
course, the radian equivalent of 30°. The remainder term $R_n(x)$ now becomes

$$R_n(x) = \left(\frac{\pi}{180}\right)^n \frac{1}{n!}\sin\left(\xi + \frac{n\pi}{2}\right),$$

whence

$$|R_n(x)| \le \left(\frac{\pi}{180}\right)^n \frac{1}{n!}.$$

For our desired accuracy we must have $|R_n(x)| < 5 \times 10^{-6}$. Hence $n$
must be such that

$$\left(\frac{\pi}{180}\right)^n \frac{1}{n!} < 5 \times 10^{-6}.$$

A short calculation soon shows this condition is satisfied for $n \ge 3$, so that
the expansion need only contain powers as far as $(x - x_0)^2$.
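
The short calculation referred to above can be reproduced as follows. This Python sketch is an added illustration; the function name `terms_needed` is hypothetical. It finds the smallest $n$ for which $(\pi/180)^n/n!$ falls below $5 \times 10^{-6}$ and then evaluates the Taylor expansion of $\sin x$ about $x_0 = \pi/6$ as far as the power $(x - x_0)^2$.

```python
import math


def terms_needed(step, tolerance):
    """Smallest n with step^n / n! < tolerance."""
    n = 1
    while step ** n / math.factorial(n) >= tolerance:
        n += 1
    return n


h = math.pi / 180   # x - x0, the radian equivalent of 1 degree
x0 = math.pi / 6    # radian equivalent of 30 degrees

n = terms_needed(h, 5e-6)

# Expansion retaining powers up to (x - x0)^2.
approx = math.sin(x0) + h * math.cos(x0) - h ** 2 / 2 * math.sin(x0)

print(n, approx, math.sin(math.radians(31)))
```

The computed value agrees with the library value of $\sin 31°$ to within the required $5 \times 10^{-6}$.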

The polynomial

$$P_{n-1}(x) = f(x_0) + (x - x_0)f'(x_0) + \cdots + \frac{(x - x_0)^{n-1}}{(n-1)!}f^{(n-1)}(x_0) \tag{12.21}$$

associated with Taylor’s theorem as expressed in Eqn (12.18) is called a
Taylor polynomial of degree $(n - 1)$ about the point $x = x_0$. It is obviously
an approximating polynomial for the function $f(x)$ in the sense that
$|f(x) - P_{n-1}(x)| \to 0$ as $n \to \infty$ for all $x$ within the interval of convergence. Hence
$P_{n-1}(x)$ is strictly analogous to the nth partial sum used in the previous
section. By way of example, the Taylor polynomial $P_3(x)$ for the exponential
function $e^x$ about the point $x = 0$ is

$$P_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!},$$

whilst its general Taylor polynomial $P_n(x)$ about the point $x = 0$ is

$$P_n(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}.$$


Example 12.19 Evaluate the integral

$$I = \int_0^{0.2} e^{-x^2}\,dx$$

by approximating $e^{-x^2}$ by its Taylor polynomial $P_2(x)$ about the point $x = 0$.
Estimate the error involved in using this approximation.


Solution Setting $f(x) = e^{-x^2}$ it is a straightforward matter to show that
$f(0) = 1$, $f'(0) = 0$, $f''(0) = -2$ and $f'''(x) = 4x(3 - 2x^2)e^{-x^2}$. Hence
$P_2(x) = 1 - x^2$, and by Taylor’s theorem with a remainder

$$e^{-x^2} = P_2(x) + \frac{x^3}{3!}f'''(\xi),$$

where $0 < \xi < 0.2$.
Now we have

$$I \approx \int_0^{0.2} P_2(x)\,dx = \int_0^{0.2} (1 - x^2)\,dx = 0.1973,$$

which is our approximate value for the integral. To assess the error $E$ we use
the fact that

$$E = \int_0^{0.2} e^{-x^2}\,dx - \int_0^{0.2} P_2(x)\,dx = \int_0^{0.2} \left(e^{-x^2} - P_2(x)\right)dx = \int_0^{0.2} \frac{x^3}{3!}f'''(\xi)\,dx.$$

In this expression, $\xi = \xi(x)$, because $0 < \xi < x$ and $x$ is itself integrated over
the interval $0 \le x \le 0.2$. Although the functional form of $\xi(x)$ is unknown,
we may obtain an overestimate of $E$ by replacing $f'''(\xi)$ by its greatest value
in the interval $0 \le x \le 0.2$. Using the fact that $f'''(x) = 4x(3 - 2x^2)e^{-x^2}$
and $\max|f'''(\xi)| = \max_x|f'''(x)|$ we estimate this latter quantity by assigning
to each of the three factors in $f'''(x)$ its maximum value. We find that

$$\max|f'''(x)| \le 0.8 \times 3 \times 1 = 2.4,$$

whence

$$E \le \frac{2.4}{3!}\int_0^{0.2} x^3\,dx \approx 0.0002.$$
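
The estimate can be compared with the true value of the integral, which is expressible through the error function. The Python sketch below is an added illustration with hypothetical function names; it uses `math.erf` from the standard library for the exact value and is not part of the original calculation.

```python
import math

UPPER = 0.2  # upper limit of integration used in the example


def approximate_integral(b):
    """Integral of P2(x) = 1 - x^2 from 0 to b."""
    return b - b ** 3 / 3


def error_bound(b, max_third_derivative):
    """(max|f'''| / 3!) times the integral of x^3 from 0 to b."""
    return max_third_derivative / math.factorial(3) * b ** 4 / 4


def exact_integral(b):
    """Integral of exp(-x^2) from 0 to b, via the error function."""
    return math.sqrt(math.pi) / 2 * math.erf(b)


approx = approximate_integral(UPPER)
bound = error_bound(UPPER, 0.8 * 3 * 1)
actual_error = abs(exact_integral(UPPER) - approx)
print(f"{approx:.4f} {bound:.5f} {actual_error:.6f}")
```

The actual error is several times smaller than the bound, as is to be expected from the crude overestimate of $\max|f'''(x)|$.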

In many books Theorem 12.10 is called the generalized mean value
theorem, since when $n = 1$ it reduces to the already familiar mean value
theorem derived in Chapter 5 (Theorem 5.12). Let us now derive the analogue
of Taylor’s theorem with a remainder for a function of two variables.

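Suppose that $f(x, y)$ has continuous partial derivatives up to those of
nth order, and consider the function

$$F(t) = f(a + ht, b + kt), \tag{12.22}$$

in which $a$, $b$, $h$, and $k$ are constants. Then $F(t) = f(x, y)$, where $x = a + ht$,
$y = b + kt$, and in the neighbourhood of $(a, b)$ we have

$$\frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = h\frac{\partial f}{\partial x} + k\frac{\partial f}{\partial y}.$$

Write this result in the form

$$\frac{dF}{dt} = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)f,$$

where the expression in parentheses is a partial differential operator with
respect to $x$ and $y$ and is not a function. It only generates a function when it
acts on a suitably differentiable function $f$. In consequence, differentiating
$r$ times, we have

$$\left(\frac{d}{dt}\right)^r F = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^r f \quad \text{for } r = 1, 2, \ldots, \tag{12.23}$$

with the understanding that:

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)f = h\frac{\partial f}{\partial x} + k\frac{\partial f}{\partial y},$$

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 f = h^2\frac{\partial^2 f}{\partial x^2} + 2hk\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2},$$

$$\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^3 f = h^3\frac{\partial^3 f}{\partial x^3} + 3h^2k\frac{\partial^3 f}{\partial x^2\,\partial y} + 3hk^2\frac{\partial^3 f}{\partial x\,\partial y^2} + k^3\frac{\partial^3 f}{\partial y^3}.$$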

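Now $F(0) = f(a, b)$, $F(1) = f(a + h, b + k)$, and $F(t)$ is differentiable $n$
times for $0 \le t \le 1$. Consequently, by applying Theorem 12.10 to the
function $F(t)$ we obtain

$$F(1) = F(0) + F'(0) + \frac{1}{2!}F''(0) + \cdots + \frac{1}{(n-1)!}F^{(n-1)}(0) + \frac{1}{n!}F^{(n)}(\xi), \tag{12.24}$$

where $0 < \xi < 1$.
However, we also have

$$F^{(r)}(0) = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^r f\,\Big|_{x=a,\;y=b}$$

and

$$F^{(n)}(\xi) = \left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^n f\,\Big|_{x=a+\xi h,\;y=b+\xi k}, \tag{12.25}$$

whence by substitution of Eqn (12.25) into Eqn (12.24) we obtain:

$$f(a + h, b + k) = f(a, b) + hf_x(a, b) + kf_y(a, b) + \frac{1}{2!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^2 f\,\Big|_{x=a,\;y=b} + \cdots + \frac{1}{(n-1)!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^{n-1} f\,\Big|_{x=a,\;y=b} + \frac{1}{n!}\left(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right)^n f\,\Big|_{x=a+\xi h,\;y=b+\xi k}, \tag{12.26}$$

where $0 < \xi < 1$.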

This result is Taylor’s theorem for a function $f(x, y)$ of two variables and
it is terminated with a Lagrange remainder term involving nth partial
derivatives. The result is also often known as the generalized mean value theorem
for a function of two variables. In particular, by taking $n = 1$ we obtain the
result

$$f(a + h, b + k) = f(a, b) + hf_x(a + \xi h, b + \xi k) + kf_y(a + \xi h, b + \xi k), \tag{12.27}$$

where $0 < \xi < 1$. This is the two variable analogue of Theorem 5.12 to
which it obviously reduces when $f = f(x)$, for then $f_y = 0$. Result Eqn
(12.26) is of such importance that it merits stating in the form of a theorem.


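THEOREM 12.11 (generalized mean value theorem in two variables) Let
$f(x, y)$ have continuous partial derivatives up to those of order $n$ in some
neighbourhood of the point $(a, b)$. Then if $(x, y)$ is any point within this
neighbourhood,

$$f(x, y) = f(a, b) + \left((x - a)\frac{\partial}{\partial x} + (y - b)\frac{\partial}{\partial y}\right)f\,\Big|_{(a,b)} + \frac{1}{2!}\left((x - a)\frac{\partial}{\partial x} + (y - b)\frac{\partial}{\partial y}\right)^2 f\,\Big|_{(a,b)} + \cdots + \frac{1}{(n-1)!}\left((x - a)\frac{\partial}{\partial x} + (y - b)\frac{\partial}{\partial y}\right)^{n-1} f\,\Big|_{(a,b)} + R_n(x, y),$$

where the Lagrange remainder

$$R_n(x, y) = \frac{1}{n!}\left((x - a)\frac{\partial}{\partial x} + (y - b)\frac{\partial}{\partial y}\right)^n f\,\Big|_{(\eta,\zeta)},$$

in which $\eta = a + \xi(x - a)$, $\zeta = b + \xi(y - b)$, and $0 < \xi < 1$.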


Example 12.20 Use the generalized mean value theorem in two variables
to expand the function

$$f(x, y) = e^{x+2xy}$$

about the point $(0, 0)$. Terminate the expansion with the Lagrange remainder
term $R_3(x, y)$ and display its form.


Solution As the expansion is required about the point $(0, 0)$ we must set
$a = 0$, $b = 0$ in Theorem 12.11 and take $n = 3$. Routine calculation shows
that:

$$f(0, 0) = 1,\quad f_x(0, 0) = 1,\quad f_y(0, 0) = 0,\quad f_{xx}(0, 0) = 1,\quad f_{xy}(0, 0) = 2,\quad f_{yy}(0, 0) = 0,$$

whilst

$$f_{xxx}(x, y) = (1 + 2y)^3 e^{x+2xy},$$
$$f_{xxy}(x, y) = 2(1 + 2y)[2 + x(1 + 2y)]\,e^{x+2xy},$$
$$f_{yyx}(x, y) = 4x[2 + x(1 + 2y)]\,e^{x+2xy},$$
$$f_{yyy}(x, y) = 8x^3 e^{x+2xy}.$$

From Theorem 12.11 we find

$$e^{x+2xy} = 1 + x + \tfrac12 x^2 + 2xy + R_3(x, y),$$

where

$$R_3(x, y) = \frac{1}{3!}\left(x^3 f_{xxx}(\eta, \zeta) + 3x^2 y f_{xxy}(\eta, \zeta) + 3xy^2 f_{yyx}(\eta, \zeta) + y^3 f_{yyy}(\eta, \zeta)\right)$$

with $\eta = \xi x$, $\zeta = \xi y$, and $0 < \xi < 1$.
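
Since the remainder $R_3(x, y)$ is of third degree in $x$ and $y$, halving both variables should reduce the error of the quadratic approximation by a factor of about eight. The Python sketch below is an added numerical check of this behaviour; the function names are hypothetical.

```python
import math


def f(x, y):
    """The function of Example 12.20."""
    return math.exp(x + 2 * x * y)


def quadratic_part(x, y):
    """Expansion about (0, 0) up to second order: 1 + x + x^2/2 + 2xy."""
    return 1 + x + x ** 2 / 2 + 2 * x * y


def remainder(x, y):
    """Actual value of R3(x, y) = f(x, y) - quadratic part."""
    return f(x, y) - quadratic_part(x, y)


e1 = remainder(0.1, 0.1)
e2 = remainder(0.05, 0.05)
print(e1, e2, e1 / e2)
```

The printed ratio is close to 8, which is consistent with a remainder built from third order partial derivatives.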


12.4 Application of Taylor’s theorem


The applications of Taylor’s theorem with a remainder are so numerous that
we can do no more here than describe some of the most common. It is hoped
that these illustrations will indicate the power of this theorem and the fact
that its use is not confined exclusively to the estimation of errors in the series
expansion of functions.


12.4(a) Indeterminate forms


The form of L’Hospital’s rule given in Theorem 5.14 is capable of immediate
extension as follows.


THEOREM 12.12 (extended L’Hospital’s rule) Let $f(x)$ and $g(x)$ be $n$ times
differentiable functions which are such that $f(a) = g(a) = 0$ and $f^{(r)}(a) =
g^{(r)}(a) = 0$ for $r = 1, 2, \ldots, n - 1$, but $\lim_{x\to a} f^{(n)}(x)$ and $\lim_{x\to a} g^{(n)}(x)$ are not
both zero. Then

$$\lim_{x\to a}\frac{f(x)}{g(x)} = \frac{\lim_{x\to a} f^{(n)}(x)}{\lim_{x\to a} g^{(n)}(x)}.$$


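Proof Using Taylor’s theorem with a remainder to expand numerator and
denominator separately gives

$$\frac{f(a + h)}{g(a + h)} = \frac{f(a) + hf'(a) + \cdots + \dfrac{h^n}{n!}f^{(n)}(\xi_1)}{g(a) + hg'(a) + \cdots + \dfrac{h^n}{n!}g^{(n)}(\xi_2)} = \frac{f^{(n)}(\xi_1)}{g^{(n)}(\xi_2)},$$

where $a < \xi_1 < a + h$, $a < \xi_2 < a + h$, the last step following because $f$, $g$ and
their first $(n - 1)$ derivatives vanish at $x = a$. If now $h \to 0$, then $\xi_1, \xi_2 \to a$ and
we obtain the result of the theorem

$$\lim_{h\to 0}\frac{f(a + h)}{g(a + h)} = \frac{\lim_{x\to a} f^{(n)}(x)}{\lim_{x\to a} g^{(n)}(x)}.$$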


Example 12.21 Find the value of the expression

$$\lim_{x\to 0}\frac{x \sin x}{(a^x - 1)(b^x - 1)}.$$

Solution This is an indeterminate form. Setting $f(x) = x \sin x$, $g(x) =
(a^x - 1)(b^x - 1)$, we first compute $f'(x)$ and $g'(x)$. We find $f'(x) = \sin x
+ x \cos x$ and $g'(x) = a^x \log a\,(b^x - 1) + b^x \log b\,(a^x - 1)$, and clearly
$\lim_{x\to 0} f'(x) = \lim_{x\to 0} g'(x) = 0$. The earlier form of L’Hospital’s rule thus fails,
and we must make appeal to Theorem 12.12 and compute $f''(x)$ and $g''(x)$.
We find $f''(x) = 2 \cos x - x \sin x$ and $g''(x) = 2a^x b^x \log a \log b +
a^x(\log a)^2(b^x - 1) + b^x(\log b)^2(a^x - 1)$, from which we see that $\lim_{x\to 0} f''(x) = 2$,
$\lim_{x\to 0} g''(x) = 2 \log a \log b$. By the conditions of Theorem 12.12 we have

$$\lim_{x\to 0}\frac{x \sin x}{(a^x - 1)(b^x - 1)} = \frac{1}{\log a \log b}.$$
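
The value of this limit is easily confirmed numerically. The Python sketch below is an added illustration with a hypothetical function name; the values $a = 2$, $b = 3$ are arbitrary choices, and `math.expm1` is used so that $a^x - 1$ is computed accurately for small $x$.

```python
import math


def ratio(x, a, b):
    """x sin x / ((a^x - 1)(b^x - 1)) for x != 0."""
    # expm1(t) = e^t - 1, evaluated without loss of accuracy for small t.
    denominator = math.expm1(x * math.log(a)) * math.expm1(x * math.log(b))
    return x * math.sin(x) / denominator


a, b = 2.0, 3.0
limit = 1 / (math.log(a) * math.log(b))
for x in (1e-1, 1e-3, 1e-5):
    print(x, ratio(x, a, b), limit)
```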


12.4(b) Local behaviour of functions of one variable


In Chapter 5 we repeatedly turned to the problem of the local behaviour of a
function of one variable in order to identify local maxima, local minima, and
points of inflection. Here again Taylor’s theorem with a remainder helps to
identify such points when not only the first derivative, but also successive
higher order derivatives vanish at a point.

Suppose that $f(x)$ is $n$ times differentiable near $x = a$ and that $f'(a) =
f''(a) = \cdots = f^{(n-1)}(a) = 0$, but that $f^{(n)}(a) \ne 0$. Then by Taylor’s
theorem

$$f(a + h) = f(a) + hf'(a) + \cdots + \frac{h^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{h^n}{n!}f^{(n)}(\xi),$$

where $a < \xi < a + h$, but because of the vanishing of the first $(n - 1)$
derivatives at $x = a$ this simplifies to

$$f(a + h) - f(a) = \frac{h^n}{n!}f^{(n)}(\xi).$$

The behaviour of the left-hand side of this expression was used in Chapter
5 to identify the nature of the extrema involved so that we see its sign is now
determined solely by the sign of $h^n f^{(n)}(\xi)$ or, for suitably small $h$, by the sign
of $h^n f^{(n)}(a)$. It is left to the reader to verify that the following theorem is an
immediate consequence of this simple result when taken in conjunction with
Definition 5.4.


THEOREM 12.13 (identification of local extrema: one independent variable)
A necessary and sufficient condition that a suitably differentiable function
$f(x)$ have a local maximum (minimum) at $x = a$ is that the first derivative
$f^{(n)}(x)$ with a non-zero value at $x = a$ shall be of even order and
$f^{(n)}(a) < 0$ ($f^{(n)}(a) > 0$). If the first derivative other than $f'(a)$ with a
non-zero value at $x = a$ is of odd order, then $f(x)$ has a point of inflection
with an associated zero gradient at $x = a$.
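
The test of Theorem 12.13 is simple to mechanize when the derivatives can be computed exactly. The Python sketch below is an added illustration restricted, as an assumption made for simplicity, to polynomials with integer coefficients listed from the constant term upwards; all function names are hypothetical.

```python
def derivative(coeffs):
    """Coefficients of the derivative of a polynomial (constant term first)."""
    return [r * c for r, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Horner evaluation; an empty list represents the zero polynomial."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify(coeffs, a):
    """Classify x = a using the first non-vanishing derivative at a."""
    d = derivative(coeffs)
    if evaluate(d, a) != 0:
        return "not stationary"
    order = 1
    while d:
        d = derivative(d)
        order += 1
        value = evaluate(d, a)
        if value != 0:
            if order % 2 == 1:
                return "inflection"
            return "minimum" if value > 0 else "maximum"
    return "undetermined"  # every derivative vanishes (constant polynomial)


print(classify([0, 0, 0, 0, 1], 0))   # x^4
print(classify([0, 0, 0, 1], 0))      # x^3
print(classify([0, 0, 0, 0, 0, 0, -1], 0))  # -x^6
```

Integer arithmetic is used throughout so that the comparison with zero is exact; with floating point data a tolerance would be needed instead.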


12.4(c) Error estimate for Simpson’s rule

In Chapter 7 it was shown that if

$$I = \int_{x_0-h}^{x_0+h} f(x)\,dx,$$

then Simpson’s rule for the approximate calculation of $I$ was

$$I \approx \frac{h}{3}\left(f(x_0 - h) + 4f(x_0) + f(x_0 + h)\right).$$

The error $E(h)$ is a function of the interval length $h$ and by definition

$$E(h) = \frac{h}{3}\left(f(x_0 - h) + 4f(x_0) + f(x_0 + h)\right) - \int_{x_0-h}^{x_0+h} f(x)\,dx.$$

Differentiating with respect to $h$ and using Theorem 7.8 to differentiate
the integral which is a function of its limits gives

$$E'(h) = \tfrac13\left(f(x_0 - h) + 4f(x_0) + f(x_0 + h)\right) + \frac{h}{3}\left(-f'(x_0 - h) + f'(x_0 + h)\right) - f(x_0 + h) - f(x_0 - h),$$

whence $E'(0) = 0$.

Differentiating again yields

$$E''(h) = \frac{h}{3}\left(f''(x_0 + h) + f''(x_0 - h)\right) + \tfrac13\left(f'(x_0 - h) - f'(x_0 + h)\right),$$

whence $E''(0) = 0$.
Finally, one further differentiation gives

$$E'''(h) = \frac{h}{3}\left(f'''(x_0 + h) - f'''(x_0 - h)\right).$$

Now setting $n = 1$ in Taylor’s theorem with a remainder and applying
it to the function $f'''(x)$ on the interval $x_0 - h \le x \le x_0 + h$ gives

$$f'''(x_0 + h) = f'''(x_0 - h) + 2hf^{(4)}(\xi),$$

where $x_0 - h < \xi < x_0 + h$. Using this result in $E'''(h)$ shows that

$$E'''(h) = \frac{2h^2}{3}f^{(4)}(\xi).$$
Now

$$\int_0^h E'''(t)\,dt = E''(h) - E''(0) = E''(h),$$

so that assigning to $|f^{(4)}(\xi)|$ the maximum value $M$ of $|f^{(4)}(x)|$ in
$x_0 - h \le x \le x_0 + h$ it follows that

$$E''(h) \le M\int_0^h \frac{2t^2}{3}\,dt = \frac{2h^3 M}{9}.$$

A further integration using the fact that $E'(0) = 0$ gives

$$E'(h) \le \int_0^h \frac{2t^3 M}{9}\,dt = \frac{h^4 M}{18},$$

after which one final integration using the obvious fact that $E(0) = 0$ yields

$$E(h) \le \int_0^h \frac{t^4 M}{18}\,dt = \frac{h^5 M}{90}.$$

This is our desired error estimate, and as $M = \max|f^{(4)}(x)|$ for
$x_0 - h \le x \le x_0 + h$, it shows that contrary to expectation, Simpson’s rule is
exact for any polynomial up to and including degree 3. This result is
surprising because Simpson’s rule was based on the fitting of a quadratic at three
equally spaced points.
Suppose for example that we desired to calculate

$$I = \int_0^{\pi/4} \sin x\,dx$$

using Simpson’s rule with only three points. Then $f(x) = \sin x$ and $f^{(4)}(x)
= \sin x$, so that if $M = \max|f^{(4)}(x)|$ for $0 \le x \le \tfrac14\pi$, then $M = 1/\sqrt2$.
We have $h = \tfrac18\pi$, so that the error incurred

$$E(\tfrac18\pi) \le \frac{1}{90\sqrt2}\left(\tfrac18\pi\right)^5 = 7.3 \times 10^{-5}.$$
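
The error estimate may be compared with the error actually incurred. The Python sketch below is an added illustration with hypothetical function names: it applies the three-point Simpson rule to $\sin x$ on $0 \le x \le \pi/4$, prints the true error beside the bound $h^5 M/90$, and also confirms that the rule integrates a cubic exactly.

```python
import math


def simpson3(f, x0, h):
    """Three-point Simpson rule on the interval [x0 - h, x0 + h]."""
    return h / 3 * (f(x0 - h) + 4 * f(x0) + f(x0 + h))


h = math.pi / 8
approx = simpson3(math.sin, math.pi / 8, h)
exact = 1 - math.cos(math.pi / 4)            # integral of sin x from 0 to pi/4
bound = h ** 5 / 90 * (1 / math.sqrt(2))     # h^5 M / 90 with M = 1/sqrt(2)
print(approx - exact, bound)


def cubic(x):
    """A sample cubic; its integral from 0 to 2 is exactly 2."""
    return x ** 3 - 2 * x + 1


print(simpson3(cubic, 1.0, 1.0))
```

The true error is roughly half the bound, and the cubic is integrated without error apart from rounding.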


12.4(d) Newton’s method


Newton’s method is a simple and powerful method for the accurate
determination of the roots of an equation $f(x) = 0$, and is based on Taylor’s
theorem with the Lagrange remainder $R_2(x)$.

Suppose $x_0$ is an approximate root of $f(x) = 0$ and $h$ is such that $x =
x_0 + h$ is an exact root. Then by Taylor’s theorem

$$f(x_0 + h) = f(x_0) + hf'(x_0) + \frac{h^2}{2!}f''(\xi),$$

where $x_0 < \xi < x_0 + h$.
As, by supposition, $f(x_0 + h) = 0$ we find

$$0 = f(x_0) + hf'(x_0) + \frac{h^2}{2}f''(\xi).$$

Now $\xi$ is not known, but on the assumption that $h$ is small we may define a
first approximation $h_1$ to $h$ by neglecting the third term and writing

$$h_1 = -\frac{f(x_0)}{f'(x_0)}.$$

The next approximation to the root itself must be $x_1 = x_0 + h_1$, whence by
the same argument, the approximation $h_2$ to the correction needed to make
$x_1$ an exact root is

$$h_2 = -\frac{f(x_0 + h_1)}{f'(x_0 + h_1)}.$$

Proceeding in this manner we find that the nth approximation $x_n$ to the exact
root of $f(x) = 0$ is, in terms of the $(n - 1)$th approximation $x_{n-1}$,

$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}.$$

The successive calculation of improved approximations in this manner is
called iteration, and $x_n$ itself is called the nth iterate.

If the sequence $\{x_n\}$ tends to a limit $x^*$, it follows that this limit must be
the desired root, for then the numerator of the correction term vanishes. The
choice of an approximate root $x_0$ with which to start the process may be
made in any convenient manner. The most usual method is to seek to show
that the root lies between two fairly close values $x = a$, $x = b$ and then to
take for $x_0$ any value that is intermediate between them. The numbers $a$, $b$
are usually found by direct calculation, which is used to prove that $f(a)$ and
$f(b)$ are of opposite sign, so that by the intermediate value theorem a zero of
$y = f(x)$ must occur in the interval $a < x < b$.

The reasons for both the success and failure of Newton’s method are
best appreciated in geometrical terms. The calculation of $x_n$ from $x_{n-1}$
amounts to tracing back the tangent to the curve $y = f(x)$ at $x_{n-1}$ until it
intersects the x-axis at the point $x_n$. If $x_n$ lies between $x_{n-1}$ and $x^*$ for all $n$
then the process converges; otherwise it may fail to converge. Fig. 12.3 (a) illustrates a
convergent iteration and Fig. 12.3 (b) a divergent one.

Fig. 12.3 (a) Convergent Newton iteration process; (b) divergent Newton iteration
process.


Example 12.22 Locate the real root of the cubic

$$x^3 + x^2 + 2x + 1 = 0.$$

Use the result to find the remaining roots.


Solution Setting $f(x) = x^3 + x^2 + 2x + 1$ we see that $f(0) = 1 > 0$ and
$f(-1) = -1 < 0$, so that by the intermediate value theorem a root of the
equation $f(x) = 0$ must lie in the interval $-1 < x < 0$. Take $x_0 = -0.5$,
since this lies within the desired interval.

Now $f'(x) = 3x^2 + 2x + 2$ so that Newton’s method requires us to
employ the relation

$$x_n = x_{n-1} - \frac{x_{n-1}^3 + x_{n-1}^2 + 2x_{n-1} + 1}{3x_{n-1}^2 + 2x_{n-1} + 2},$$

starting with $x_0 = -0.5$.


A straightforward calculation shows that to four decimal places $x_1 =
-0.5714$, $x_2 = -0.5698$, and $x_3 = -0.5698$. The iteration process has thus
converged to within the required accuracy in only three iterations. The real
root is $x^* = -0.5698$, and the remaining two roots can now be found by
dividing $f(x)$ by the factor $(x + 0.5698)$ and then solving the remaining
quadratic in the usual manner. If this is done, long division gives

$$\frac{x^3 + x^2 + 2x + 1}{x + 0.5698} = x^2 + 0.4302x + 1.7549,$$

from which we find the other two roots are

$$x = -0.2151 + i\,1.3071 \quad\text{and}\quad x = -0.2151 - i\,1.3071.$$
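
The iteration of Example 12.22 is readily programmed. The Python sketch below is an added illustration; the function name `newton_iterates` and the fixed iteration count are hypothetical choices, and no safeguard is included against a vanishing derivative.

```python
def f(x):
    """Cubic of Example 12.22."""
    return x ** 3 + x ** 2 + 2 * x + 1


def fprime(x):
    """Derivative of the cubic."""
    return 3 * x ** 2 + 2 * x + 2


def newton_iterates(func, dfunc, x0, count):
    """Return [x0, x1, ..., x_count] produced by Newton's method."""
    iterates = [x0]
    for _ in range(count):
        x = iterates[-1]
        iterates.append(x - func(x) / dfunc(x))
    return iterates


for n, x in enumerate(newton_iterates(f, fprime, -0.5, 4)):
    print(n, round(x, 4))
```

The printed iterates reproduce the values quoted in the example, and the later iterates agree to many more figures than the four retained there.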


12.5 Applications of the generalized mean value
theorem


The applications of the extension of Taylor’s theorem to functions of two or 
more variables are perhaps even more extensive than those of Taylor’s 
theorem itself. This section illustrates a few of the simplest and most used, 
connected mainly with functions of two variables. The final application, 
connected with the least squares fitting of a polynomial, is the only one con- 
cerning functions of more than two variables. 


12.5(a) Stationary points of functions of two variables


Consider the function $z = f(x, y)$ of the two real independent variables $x$, $y$
which is defined in some region $D$ of the $(x, y)$-plane bounded by the curve $\gamma$.
The notion of its graph is already familiar to us and it comprises a surface $S$
with points $(x, y, f(x, y))$, the projection of the boundary $\Gamma$ of which onto
the $(x, y)$-plane is the curve $\gamma$. A typical situation is shown in Fig. 12.4 (a, b)
where the point $P$ is obviously a maximum and the point $Q$ is obviously a
minimum.

Fig. 12.4 (a) Surface having maximum at P; (b) surface having minimum at Q.

Intuitively, and by analogy with the single variable case, it would seem
that all that is necessary to locate extrema such as $P$, $Q$ is to find those points
$(x_0, y_0)$ at which $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$. This is, in effect, saying that the
tangent plane at either a maximum or a minimum must be parallel to the
$(x, y)$-plane. Unfortunately, this is not a sufficiently stringent condition, for
the point $R$ in Fig. 12.5 is neither a maximum, nor a minimum, yet the tangent
plane at that point is certainly parallel to the $(x, y)$-plane. Because of the
shape of the surface it is called a saddle point. It is characterized by the fact
that if the surface is sectioned through $R$ by different planes parallel to the
z-axis, then for some the curve of section has a minimum at $R$ and for others
a maximum.

Each of these points $P$, $Q$, $R$ is called a stationary point of the function
$z = f(x, y)$ because $f_x$ and $f_y$ vanish at these points.


DEFINITION 12.3 (stationary points of $f(x, y)$) Let $f(x, y)$ be a
differentiable function in some region $D$ of the $(x, y)$-plane. Then any point $(x_0, y_0)$ in
$D$ for which $f_x(x_0, y_0) = 0$ and $f_y(x_0, y_0) = 0$ is called a stationary point of
the function $f(x, y)$ in $D$.

If for all $(x, y)$ near $(x_0, y_0)$ it is true that $f(x, y) < f(x_0, y_0)$, then $f(x, y)$
will be said to have a local maximum at $(x_0, y_0)$. If for all $(x, y)$ near to $(x_0, y_0)$
it is true that $f(x, y) > f(x_0, y_0)$, then $f(x, y)$ will be said to have a local
minimum at $(x_0, y_0)$. In the event that $f(x, y)$ assumes values both greater
and less than $f(x_0, y_0)$ for $(x, y)$ near to a stationary point $(x_0, y_0)$, then
$f(x, y)$ will be said to have a saddle point at $(x_0, y_0)$.


We now use the generalized mean value theorem to prove the following 
result. 


THEOREM 12.14 (identification of extrema of $f(x, y)$) Let $f(x, y)$ be a
function with continuous first and second order partial derivatives. Then a
sufficient condition that $(x_0, y_0)$ is a local maximum (minimum) for $f(x, y)$ is that:

(a) $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$;
(b) $f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) > f_{xy}^2(x_0, y_0)$;
(c) $f_{xx}(x_0, y_0) < 0$ for a maximum, $f_{xx}(x_0, y_0) > 0$ for a minimum.

A sufficient condition that $f(x, y)$ should have a saddle point at $(x_0, y_0)$ is
that in addition to condition (a) above being satisfied, it is also true that:

(d) $f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) < f_{xy}^2(x_0, y_0)$.

Fig. 12.5 Saddle point on the surface z = f(x, y).


Proof Note first that (b) implies either that $f_{xx}(x_0, y_0) > 0$ and $f_{yy}(x_0, y_0)
> 0$ or that $f_{xx}(x_0, y_0) < 0$ and $f_{yy}(x_0, y_0) < 0$. Consider the case $f_{xx}(x_0, y_0)
> 0$. Then by the generalized mean value theorem with $n = 2$, the first order
terms vanishing by condition (a),

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac12\left[h^2 f_{xx}(\eta, \zeta) + 2hk f_{xy}(\eta, \zeta) + k^2 f_{yy}(\eta, \zeta)\right],$$

where $\eta = x_0 + \xi h$, $\zeta = y_0 + \xi k$ with $0 < \xi < 1$. Now as $f_{xx}$, $f_{xy}$, and $f_{yy}$
are assumed continuous, it follows from (b) that for sufficiently small $h$ and
$k$, $f_{xx}(\eta, \zeta)f_{yy}(\eta, \zeta) - f_{xy}^2(\eta, \zeta) > 0$. Thus we have

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac12(Ah^2 + 2Bhk + Ck^2),$$

where $A = f_{xx}(\eta, \zeta)$, $B = f_{xy}(\eta, \zeta)$, $C = f_{yy}(\eta, \zeta)$. Completing the square
on the right-hand side of this equation allows us to write it as

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac12 A\left\{\left(h + \frac{B}{A}k\right)^2 + \left(\frac{AC - B^2}{A^2}\right)k^2\right\}.$$

Clearly, $(h + (B/A)k)^2 \ge 0$ and $[(AC - B^2)/A^2]k^2 > 0$ if $k \ne 0$ since,
by hypothesis, $AC - B^2 > 0$. In the event $k = 0$, then $Ah^2 + 2Bhk + Ck^2
= Ah^2 > 0$ provided $h \ne 0$.

Thus, if not both $h$ and $k = 0$, since we are assuming $A > 0$ we have
shown that

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) > 0$$

for small $h$, $k$ or, equivalently,

$$f(x, y) > f(x_0, y_0),$$

for all $(x, y)$ near $(x_0, y_0)$. This is the condition that $f(x, y)$ should have a
local minimum at $(x_0, y_0)$.

The verification of the condition for a local maximum at $(x_0, y_0)$ follows from the above argument by setting $g(x, y) = -f(x, y)$ and then supposing that $f_{xx}(x_0, y_0) < 0$. This establishes that $g(x, y)$ has a local minimum at $(x_0, y_0)$ so that $f(x, y)$ must have a local maximum at that point.

The verification of the condition for a saddle point follows directly from consideration of the result

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = \tfrac{1}{2}A\left\{\left(h + \frac{B}{A}k\right)^2 + \left(\frac{AC - B^2}{A^2}\right)k^2\right\}$$

which was derived above. For now, by hypothesis, $AC - B^2 < 0$, so that the terms within the large bracket are of opposite signs. This implies that $f(x_0 + h, y_0 + k) - f(x_0, y_0)$ can be made either positive or negative near $(x_0, y_0)$ by a suitable choice of $h$, $k$. This is the condition for a saddle point and completes the proof of the theorem.


Example 12.23 Find the stationary points of the function

$$f(x, y) = 2x^3 - 9x^2 y + 12xy^2 - 60y$$

and identify their nature.

Solution We have

$$f_x = 6x^2 - 18xy + 12y^2 \quad\text{and}\quad f_y = -9x^2 + 24xy - 60.$$

The conditions $f_x = f_y = 0$ are equivalent to

$$(f_x = 0)\qquad (x - y)(x - 2y) = 0$$

and

$$(f_y = 0)\qquad 3x^2 - 8xy + 20 = 0.$$

From the first condition we may either have $x = y$ or $x = 2y$. Substituting $x = y$ in the second condition gives rise to the equation $y^2 = 4$, so that the stationary points corresponding to $x = y$ are $(2, 2)$ and $(-2, -2)$. Substituting $x = 2y$ in the second condition gives rise to the condition $y^2 = 5$, so that the stationary points corresponding to $x = 2y$ are $(2\sqrt{5}, \sqrt{5})$ and $(-2\sqrt{5}, -\sqrt{5})$.

There are thus four stationary points associated with the function in question and we must apply the tests given in Theorem 12.14 to identify their nature. We have

$$f_{xx} = 12x - 18y, \quad f_{xy} = -18x + 24y, \quad f_{yy} = 24x,$$

and it is easily verified that $f_{xx}f_{yy} - f_{xy}^2 < 0$ at both of the points $(2, 2)$ and $(-2, -2)$, showing that they must be saddle points. A similar calculation shows that $f_{xx}f_{yy} - f_{xy}^2 > 0$ at each of the other stationary points, though $f_{xx} > 0$ at $(2\sqrt{5}, \sqrt{5})$, showing that it must be a minimum, whereas $f_{xx} < 0$ at $(-2\sqrt{5}, -\sqrt{5})$, showing that it must be a maximum.
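The tests of Theorem 12.14 for this example can also be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function names `gradient`, `second_derivs` and `classify` are illustrative choices, and the derivative formulas are those quoted in the solution above.

```python
import math

def gradient(x, y):
    # First partial derivatives of f(x, y) = 2x^3 - 9x^2 y + 12 x y^2 - 60 y.
    fx = 6 * x**2 - 18 * x * y + 12 * y**2
    fy = -9 * x**2 + 24 * x * y - 60
    return fx, fy

def second_derivs(x, y):
    # Second partial derivatives f_xx, f_xy, f_yy of the same function.
    return 12 * x - 18 * y, -18 * x + 24 * y, 24 * x

def classify(x, y):
    # Apply conditions (b), (c) and (d) of the theorem at a stationary point.
    fxx, fxy, fyy = second_derivs(x, y)
    d = fxx * fyy - fxy**2
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    if d < 0:
        return "saddle"
    return "undetermined"  # the theorem gives no information when d = 0

r5 = math.sqrt(5)
points = [(2, 2), (-2, -2), (2 * r5, r5), (-2 * r5, -r5)]
results = {p: classify(*p) for p in points}
for p, kind in results.items():
    print(p, gradient(*p), kind)
```

The printed gradients should be zero to within rounding error, confirming that each point is stationary before the second order test is applied.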


12.5 (b) Constrained extrema


A slightly more difficult problem involving the location of the extrema of a function $z = f(x, y)$ of two variables occurs when the points $(x, y)$ are constrained to lie on some curve $g(x, y) = 0$. This is illustrated in Fig. 12.6, in which $f(x, y)$ is defined at points in the region $D$ of the $(x, y)$-plane contained within the curve $\gamma$ defined by $g(x, y) = 0$. The boundary points on the surface $z = f(x, y)$ corresponding to the boundary curve $\gamma$ of $D$ form the closed space curve $\Gamma$. Our task is to locate the maximum and minimum values $P$ and $Q$ assumed by $z = f(x, y)$ on the curve $\Gamma$. These correspond to the points $P'$ and $Q'$ on $\gamma$.

[Fig. 12.6 Constrained extrema.]

In principle this is a problem of locating the extrema for a function of one variable, because solving $g(x, y) = 0$ explicitly for $y$ in the form $y = h(x)$ shows that we must find and identify the stationary points of $z = F(x)$, where $F(x) = f(x, h(x))$. However, this is usually an impossible task because $g(x, y) = 0$ cannot, as a general rule, be solved explicitly for $y$. Instead we proceed as follows.


We have

$$z = f(x, y) \tag{12.28}$$

and

$$g(x, y) = 0, \tag{12.29}$$

so that forming the total derivatives of these with respect to $x$ gives

$$\frac{dz}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} \tag{12.30}$$

and

$$0 = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial y}\frac{dy}{dx}. \tag{12.31}$$

Thus on $\gamma$ we have, provided $\partial g/\partial y \neq 0$,

$$\frac{dy}{dx} = -\frac{\partial g}{\partial x}\bigg/\frac{\partial g}{\partial y},$$

whence on $\gamma$ Eqn (12.30) becomes

$$\frac{dz}{dx} = \frac{\partial f}{\partial x} - \left(\frac{\partial f}{\partial y}\right)\left(\frac{\partial g}{\partial x}\right)\bigg/\left(\frac{\partial g}{\partial y}\right).$$

As already remarked, on $\gamma$ the function $z = f(x, y)$ is effectively only a function of $x$, so that its stationary points will be determined by the condition $dz/dx = 0$. Thus the solution to our problem lies in solving the equation

$$\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x} = 0 \tag{12.32}$$

subject to the constraint condition

$$g(x, y) = 0. \tag{12.33}$$

Algebraically, this is equivalent to determining the value of the parameter $\lambda$ in order that the function of two variables

$$w = f(x, y) + \lambda g(x, y) \tag{12.34}$$

should have a stationary point subject to the constraint condition Eqn (12.33). This follows because for $w$ to have a stationary point we need both $w_x = f_x + \lambda g_x = 0$ and $w_y = f_y + \lambda g_y = 0$, and these homogeneous equations have a solution only when condition (12.32) is satisfied. The parameter $\lambda$ occurring in Eqn (12.34) is called a Lagrange multiplier. Solving these equations locates the stationary points but does not identify their nature. This must be undertaken by an examination of the conditions in the neighbourhood of the stationary points and possibly, as in the following example, by other considerations implicit in the problem.
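The elimination of the multiplier can be written out as a short worked step. This is a sketch in the notation of this section, with subscripts denoting partial derivatives; multiplying the first stationarity condition by $g_y$, the second by $g_x$, and subtracting removes $\lambda$:

$$\begin{aligned}
f_x + \lambda g_x &= 0, \\
f_y + \lambda g_y &= 0
\end{aligned}
\quad\Longrightarrow\quad
g_y(f_x + \lambda g_x) - g_x(f_y + \lambda g_y) = f_x g_y - f_y g_x = 0 .$$

The result is the stationarity condition in which the two products of partial derivatives of $f$ and $g$ are equated. When $g_y \neq 0$ the multiplier itself is recovered from the second condition as $\lambda = -f_y/g_y$.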


Example 12.24 Determine the dimensions of the rectangle of maximum area whose sides are parallel to the coordinate axes and whose corners are constrained to lie on the ellipse $x^2 + 2y^2 = 1$.

Solution From the symmetry of the ellipse it follows that if $(x, y)$ is a point on $x^2 + 2y^2 = 1$, then the rectangle having a corner at this point must have a side of length $2x$ parallel to the $x$-axis and a side of length $2y$ parallel to the $y$-axis. The area $z$ of the rectangle is thus $z = 4xy$ and is, by definition, positive. The constraint condition corresponding to $g(x, y) = 0$ is $x^2 + 2y^2 - 1 = 0$. So, making the identifications $f(x, y) = 4xy$ and $g(x, y) = x^2 + 2y^2 - 1$, we next form the function

$$w = 4xy + \lambda(x^2 + 2y^2 - 1)$$

corresponding to Eqn (12.34).

We have $w_x = 4y + 2\lambda x$ and $w_y = 4x + 4\lambda y$, and as the stationary points of $w$ occur when $w_x = w_y = 0$, this is equivalent to requiring

$$\lambda x + 2y = 0 \quad\text{and}\quad x + \lambda y = 0.$$

For these homogeneous equations to have a solution, the determinant of their coefficients must vanish, giving rise to the condition

$$\begin{vmatrix} \lambda & 2 \\ 1 & \lambda \end{vmatrix} = 0 \quad\text{or}\quad \lambda^2 - 2 = 0.$$

Hence $\lambda = \pm\sqrt{2}$. When $\lambda = \sqrt{2}$ we have $x + \sqrt{2}\,y = 0$, and as this is subject to the constraint condition $x^2 + 2y^2 = 1$, it follows that the two possible solutions are $(-1/\sqrt{2}, 1/2)$, $(1/\sqrt{2}, -1/2)$. When $\lambda = -\sqrt{2}$ we have $x - \sqrt{2}\,y = 0$ and the same reasoning leads to the two other solutions $(1/\sqrt{2}, 1/2)$, $(-1/\sqrt{2}, -1/2)$. The extrema of $z = 4xy$ on the curve $x^2 + 2y^2 = 1$ thus occur at the four stated points. As both the area of the rectangle and the lengths of its sides must be positive, the only solution we may accept as being physically real is $(1/\sqrt{2}, 1/2)$, for this implies sides of length $\sqrt{2}$ and 1.
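The four stationary points of this example can be verified directly, and the accepted solution compared with a brute-force search along the ellipse. The sketch below is illustrative only; the parametrization $x = \cos t$, $y = \sin t/\sqrt{2}$ and the names `lagrange_residuals` and `best_by_search` are assumptions introduced here, not part of the text.

```python
import math

def f(x, y):
    # Area of the rectangle with a corner at (x, y).
    return 4 * x * y

def g(x, y):
    # Constraint: the corner lies on the ellipse x^2 + 2y^2 = 1.
    return x * x + 2 * y * y - 1

def lagrange_residuals(x, y, lam):
    # w_x, w_y for w = f + lam * g, together with the constraint value.
    return 4 * y + 2 * lam * x, 4 * x + 4 * lam * y, g(x, y)

s = 1 / math.sqrt(2)
root2 = math.sqrt(2)
candidates = [(-s, 0.5, root2), (s, -0.5, root2),
              (s, 0.5, -root2), (-s, -0.5, -root2)]

def best_by_search(n=10000):
    # Scan the first-quadrant arc of the ellipse, x = cos t, y = sin t / sqrt(2).
    best = (0.0, 0.0, 0.0)
    for i in range(1, n):
        t = 0.5 * math.pi * i / n
        x, y = math.cos(t), math.sin(t) / root2
        if f(x, y) > best[0]:
            best = (f(x, y), x, y)
    return best

area, x_best, y_best = best_by_search()
print(area, 2 * x_best, 2 * y_best)  # area and side lengths found by the search
```

The search is only a cross-check under the stated parametrization; the multiplier method locates the point exactly.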


12.5 (c) Stationary points of functions of several variables

By direct analogy with the concept of stationary points of a differentiable function of two variables, the stationary points of a differentiable function $w = f(x_1, x_2, \ldots, x_n)$ of the $n$ independent variables $x_1, x_2, \ldots, x_n$ are defined to be those points at which

$$\frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x_2} = \cdots = \frac{\partial f}{\partial x_n} = 0. \tag{12.35}$$

The concepts of maxima and minima also extend in an obvious manner, for if $P$ is a stationary point of $w = f(x_1, x_2, \ldots, x_n)$ and $Q$ is a neighbouring point, we shall say that $P$ is a local minimum of $f$ if $f(Q) - f(P) > 0$ for all points $Q$ in the neighbourhood of $P$. Similarly, we shall say that $P$ is a local maximum of $f$ if $f(Q) - f(P) < 0$ for all points $Q$ in the neighbourhood of $P$.

We offer no further discussion of these matters aside from their application to the special problem of polynomial fitting by least squares. This is the name given to the process whereby a polynomial of given degree $m$

$$Y(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_m x^m$$

with unknown coefficients $c_0, c_1, \ldots, c_m$ is fitted to $n$ pairs of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $n > m$. The fitting is carried out in such a manner that the sum of the squares of the differences $\sum_{r=1}^{n} (Y(x_r) - y_r)^2$ is minimized.

In graphical terms this amounts to obtaining the best fit in the least squares sense of a polynomial curve of degree $m$ to a set of $n$ points which are connected by an unknown functional relationship. This process is of importance in statistics when the points usually represent the result of the measurement of determinate quantities which have random errors associated with them.

Our task is to minimize the sum $E(c_0, c_1, \ldots, c_m)$ of the squares of the errors at the known points, where

$$E(c_0, c_1, \ldots, c_m) = \sum_{r=1}^{n} \left(c_0 + c_1 x_r + c_2 x_r^2 + \cdots + c_m x_r^m - y_r\right)^2.$$

The square error $E(c_0, c_1, \ldots, c_m)$ is a differentiable function of the unknown quantities $c_0, c_1, \ldots, c_m$, which we shall now regard in the role of variables. We must determine them so that $E(c_0, c_1, \ldots, c_m)$ is minimized. From our earlier remarks we see that $E(c_0, c_1, \ldots, c_m)$ will have a stationary value if

$$\frac{\partial E}{\partial c_0} = \frac{\partial E}{\partial c_1} = \cdots = \frac{\partial E}{\partial c_m} = 0.$$

We must thus solve these $(m + 1)$ simultaneous equations for $c_0, c_1, \ldots, c_m$. Performing the indicated differentiation in the general case we find

$$\frac{\partial E}{\partial c_p} = \sum_{r=1}^{n} 2\left(c_0 + c_1 x_r + c_2 x_r^2 + \cdots + c_m x_r^m - y_r\right)x_r^p,$$

for $p = 0, 1, \ldots, m$. Hence the numbers $c_0, c_1, \ldots, c_m$ must be obtained by solving the $(m + 1)$ simultaneous equations

$$\sum_{r=1}^{n} \left(c_0 + c_1 x_r + c_2 x_r^2 + \cdots + c_m x_r^m - y_r\right)x_r^p = 0, \tag{12.36}$$

for $p = 0, 1, \ldots, m$. When matters are well behaved there is only one solution to this set of equations, and as $E(c_0, c_1, \ldots, c_m)$ is essentially positive it is not difficult to verify that the corresponding solution $Y(x)$ minimizes $E(c_0, c_1, \ldots, c_m)$.


Example 12.25 Fit by least squares the polynomial $Y = c_0 + c_1 x$ to the four points (0, 0.2), (1, 1.1), (2, 1.8), (3, 3.2).

Solution In the notation of Eqn (12.36) we have $m = 1$ and $n = 4$, so that we must solve the two simultaneous equations

$$\sum_{r=1}^{4} (c_0 - y_r) + \sum_{r=1}^{4} c_1 x_r = 0$$

and

$$\sum_{r=1}^{4} (c_0 - y_r)x_r + \sum_{r=1}^{4} c_1 x_r^2 = 0,$$

or, equivalently,

$$4c_0 + c_1 \sum_{r=1}^{4} x_r = \sum_{r=1}^{4} y_r$$

and

$$c_0 \sum_{r=1}^{4} x_r + c_1 \sum_{r=1}^{4} x_r^2 = \sum_{r=1}^{4} x_r y_r.$$

Now

$$\sum_{r=1}^{4} x_r = 6, \quad \sum_{r=1}^{4} x_r^2 = 14, \quad \sum_{r=1}^{4} y_r = 6.3, \quad\text{and}\quad \sum_{r=1}^{4} x_r y_r = 14.3,$$

so that the equations become $4c_0 + 6c_1 = 6.3$ and $6c_0 + 14c_1 = 14.3$, from which it follows that the solution to the equations is $c_0 = 0.12$ and $c_1 = 0.97$. The required straight line fitted by least squares thus has the equation

$$Y = 0.12 + 0.97x.$$
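The same calculation can be carried out for any degree $m$ by assembling the $(m + 1)$ simultaneous equations and solving them numerically. The following Python sketch is one possible arrangement, assuming Gaussian elimination with partial pivoting as the solver; the function names `normal_equations` and `solve` are illustrative and do not come from the text.

```python
def normal_equations(xs, ys, m):
    # Coefficient matrix a[p][q] = sum of x_r^(p+q) and right-hand side
    # b[p] = sum of y_r * x_r^p, for p, q = 0, 1, ..., m.
    a = [[sum(x ** (p + q) for x in xs) for q in range(m + 1)]
         for p in range(m + 1)]
    b = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(m + 1)]
    return a, b

def solve(a, b):
    # Gaussian elimination with partial pivoting followed by back substitution.
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - known) / aug[i][i]
    return coeffs

# Data of the straight-line example: four points, degree m = 1.
xs = [0, 1, 2, 3]
ys = [0.2, 1.1, 1.8, 3.2]
a, b = normal_equations(xs, ys, 1)
c0, c1 = solve(a, b)
print(round(c0, 2), round(c1, 2))
```

For a higher degree only the argument `m` changes; for data that make the equations ill-conditioned a more careful solver than this sketch would be advisable.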
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PROBLEMS 


Section 12:1 
12.1 Write down the general term in each of the following series:


3.5 7 
(8) ἘΣ 8 ; 
2 4 6. 
Oetetotyt ; 
1.3. 1.3.5. 1.3.5.7 
Olt+rgt+padtia.7s0t 


IL) <.f 1 1 1 I 

(ὦ 9 ἘΦ Ἐπ Ὑ 2 Ὁ 55 38 Ὁ ; 
1 1 I 1 

(Ὁ) Ἐς Ἐπ 


12.2 The series $a + ar + ar^2 + \cdots + ar^n + \cdots$ is called either the geometric progression or the geometric series with initial term $a$ and common ratio $r$. Denote by $S_n$ the sum of its first $n$ terms so that

$$S_n = \sum_{m=0}^{n-1} ar^m.$$

By considering the difference $S_n - rS_n$ prove that

$$S_n = a\left(\frac{1 - r^n}{1 - r}\right).$$

If $|r| < 1$ deduce that

$$\sum_{m=0}^{\infty} ar^m = \frac{a}{1 - r}.$$

What is the remainder $R_n$ of the series after $n$ terms?


12.3 Sum the following infinite series and find their remainders after $n$ terms:


DN sae De - ἢ οὐδόν ll 
2 — — — Ὁ ee @ * 
(). 2 ΕἸ Ἔχ τς Ἔχε τοῖν 2 Ὁ τον ν 
2: Be Φ,. 1»... 
2D ag ee ga eg Egg ag 


12:4 State which of the following series is divergent by Theorem 12:2: 


(a) (1-3) + (t+) +(1-p) + oe ἘΠῚ oo ba fs 
4 9 n? 


ὌΝ Γ᾽ 3 


᾿] 22 325 μ3 
(Ὁ τ + σὲ ΒΩ τ᾿ ἘΩ͂Ν ue ΣΝ. 
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(d) l+rt+rt:: ie oe ian 
ὌΝ 
ν2 V3 νη 
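12.5 Prove the divergence of the harmonic series by obtaining simple underestimates for the sums of each of the indicated groupings of its terms and showing that they themselves form a series which is obviously divergent.

$$1 + \frac{1}{2} + \underbrace{\left(\frac{1}{3} + \frac{1}{4}\right)}_{2\ \text{terms}} + \underbrace{\left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right)}_{4\ \text{terms}} + \underbrace{\left(\frac{1}{9} + \cdots + \frac{1}{16}\right)}_{8\ \text{terms}} + \cdots$$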


12:6 Use the comparison test to classify the following series as convergent or 
divergent: 


(a) 1 +8) 4:)* + Gh “τ Fees 


Oates - 


n? + 3 
| I l 1 
()}1 Ἐπ Ἐπ τὰ “ΞΕ ΤΙ ὦ : 
I Ι 1 1 
en ee τοῦ Wg 


12.7 Use the integral test to determine the convergence or divergence of the following series:

(a) $\dfrac{1}{2\log 2} + \dfrac{1}{3\log 3} + \dfrac{1}{4\log 4} + \cdots + \dfrac{1}{n\log n} + \cdots$;

(b) $\dfrac{1}{2\log^2 2} + \dfrac{1}{3\log^2 3} + \dfrac{1}{4\log^2 4} + \cdots + \dfrac{1}{n\log^2 n} + \cdots$.

Where appropriate, estimate the remainder after six terms.


12:8 Classify the following convergent series as conditionally convergent or 
absolutely convergent: _ 


@ 1-+5-: ; ἐς. es 
ΩΣ 
Ol-gta-at: + (DM + 3 

12.9 Test the following series for convergence by the ratio test: 
i+ et Pepe BR Oe, 
tet yt SG, 
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2 5 10 n2 + ] 
Ost atot 3n2 4+ 2 > 
930 43,80 730 
aa νυ τευ δεν rs 
=. Ὑἢ 2! 3! πὶ 
©) τὸ ὁ τοῦ ἡ τοῦ t + τον τ 


Where appropriate, estimate the remainder after four terms. 


12:10 Test the following series for convergence by the nth root test: 


4\2 6\3 8 \4 2n n 
eee pees πο ee τς ae — 1\n(n—-1)/2 Saar τῷ 
@ 1-(3) - ἢ + (7) nes πὼς (noms | 
2. (33 /(4\5 ne 1\2n-1 . 
m5+ (3) +(a) t+ (Gea) ot 


1 2 \» 3 \? n Δ" 
Oret (ra are ετττα [Π3}} τ 


Where appropriate, estimate the remainder after five terms. 


12-11 Test the following alternating series for convergence: 


1 1 1 1 1 
———-_ <a —_——> =< . . . — n+l Φ « . * 
] ] Ϊ ] 
ma “Ὁ «5... — Ὁ . « ---- +1 -----ἝἜ ὠ-.ὄ.ς .-ε:-" 
ἰὼ ν 5 δι 75. 475 ἜΡΟΝ mist : 
1 1{Π|8Ὸ 1/1\8 1 /i\" 
--ὀ «τ΄ =m — — — ~— *¢ * . — n+1 — — ‘ 


Where appropriate, estimate the remainder after ten terms. 


12:12 Indicate the fault in one of the following contradictory arguments. 


< 1 
Let Sr = =e 
i Σ r(r + 2) 


the S =>) sea ἘΦ 
nen = Ne rE 2)} 
and for 2 = 2m + 1 (nm — odd) 
ΤΕ ΡΟ ΚΒ 
oe. A 2(m + 2) 


whilst for ἢ = 2m (n-even) 


“4 Wm+i1) 2m + 2) 
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Hence for all n it is true that 


S = lim Sp = }. 


n—> C 


However, alternatively, 


n+2_ n+4 ao m+2 Oo m+2 5 
S = + i απ —— ἘΞ Φ 
-Σ (ὦ 4n  A(n Ta) i+? 2 m 2 


12.13 The ratio test as expressed in Theorem 12.5 will fail if the ratio $|a_{n+1}/a_n|$ does not tend to a limit. Examine the proof of this theorem and show that $\sum a_n$ will still converge even if $|a_{n+1}/a_n|$ has no limit, provided only that $|a_{n+1}/a_n| < L$ for all $n$, where $L < 1$. Show that the series will diverge if $|a_{n+1}/a_n| > 1$ for all $n$. This result is called the extended ratio test. Use the result to prove the convergence of the series


5 7 5 6 + (—1)" 
sa a ΤΌΘ Ss ees ὧς τῷ ὙΠῸ: Ὁ Ὁ 5 τ 
6 36 162 ἐὰν δ ( 2.35 ) 


and show that the extended ratio test fails for the series 


1 1 1 1 1 1 


τ ΤΣ ae τ τες 


12.14 The nth root test as expressed in Theorem 12.6 will fail if the root $\sqrt[n]{|a_n|}$ does not tend to a limit. Examine the proof of this theorem and show that $\sum a_n$ will still converge even if $\sqrt[n]{|a_n|}$ has no limit, provided only that $\sqrt[n]{|a_n|} < L$ for all $n$, where $L < 1$. Show that the series will diverge if $\sqrt[n]{|a_n|} > 1$ for all $n$. This result is called the extended nth root test. Use the result to prove the convergence of the last series in Problem 12.13. Notice that the extended nth root test is a stronger test than the extended ratio test.


Section 12:2 
12:15 Find the interval of convergence of each of the following power series: 


2! 3! n! 
2 aT — Tt « # oe * 
(ῶ χ Ἐ πε χὴ Ἐ yet: eae ἘΕ ; 


( 1)2 12 
(21)? Pert) ste a Uo re sc aN cere 


Oe 6! (Qn)! 


1+5 προ Os acd es ae 
(c) 1+ tr, tar Ft + arr a ; 


n 
(Ὁ 1 +7 +545 τὰ op pees 


+- 


ἈΠ ΕἼ 7 Me IS ΧΈΡΙ Ὥδ ας ΕἾ ea De 
(Oar: 7.9 + 3.93 7 n.9n 


12.16 Prove that if the power series $\sum_{n=0}^{\infty} a_n(x - x_0)^n$ is convergent by the nth root test, then its radius of convergence $r$ is given by the expression

$$r = \lim_{n \to \infty} \frac{1}{\sqrt[n]{|a_n|}}.$$

Using this method to determine the radius of convergence, give examples of power series with:

(a) an infinite radius of convergence;
(b) a finite radius of convergence;
(c) a zero radius of convergence.

12.17 The following are examples of power series in which not all powers of $x$ are present. Determine their intervals of convergence by means of a direct application of either the ratio test or the nth root test to the terms of the series.


οὺ 
(a) Σ᾽ δ" χη, 
n=0 


-- ὅ( + 2)5 χη, 
ΟΣ ἘΞ 


co 3n—-1y2n-1 


ΟΣ Gp 
ο 
12.18 Let the power series $S(x) = \sum_{n=0}^{\infty} a_n x^n$ be such that for $-r < x < r$, a sequence $\{M_n\}$ of constants exists with the property that $|a_n x^n| \leq M_n$ for all $n$ and $\sum_{n=0}^{\infty} M_n$ is convergent. Use the comparison test to prove that the convergence of $\sum_{n=0}^{\infty} M_n$ implies the absolute convergence of $\sum_{n=0}^{\infty} a_n x^n$ for all $x$ in $(-r, r)$.
n=0 


οο 
12.19 The convergent series $S(x) = \sum_{n=0}^{\infty} a_n x^n$ will obviously be continuous for $x$ within its interval of convergence $-r < x < r$ if $|S(x + h) - S(x)| \to 0$ as $h \to 0$. Use the mean value theorem for derivatives to justify writing

$$|S(x + h) - S(x)| \leq |h| \sum_{n=1}^{\infty} n\,|a_n|\,|\xi_n|^{n-1},$$

where $x < \xi_n < x + h$ for $n = 1, 2, \ldots$. Show that the series on the right-hand side is convergent and that as $h \to 0$, so its radius of convergence approaches the value $r$. Hence deduce that $|S(x + h) - S(x)| \to 0$ as $h \to 0$.


12.20 Find the radius of convergence of each of the following power series and verify that the differentiated series has the same radius of convergence:

~~ (ἡ - 3)" 
(8) ΡΝ (2n + 1)25 + 12’ 


= (—1)"(x — 2)» 
Ὁ > Gn + 3) +1)’ 


(x + 9)” 
© > iat 1)? 
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12:21 Find the power series representation of arcsin x by considering the integral 


$$\arcsin x = \int_0^x \frac{dt}{\sqrt{1 - t^2}}$$

and find its radius of convergence. Does the power series converge at the end points of its interval of convergence?


12.22 Find a series representation for the elliptic integral

$$\int_0^{\pi/2} \sqrt{1 - k^2 \sin^2 \varphi}\; d\varphi.$$

For what values of $k$ is the resulting series convergent?


12.23 By using the power series representation for $\sin x$, find a series representation for the definite integral

$$\int_0^c \frac{\sin x}{x}\, dx.$$

For what values of $c$ is the resulting series convergent?


12:24 By using the power series representation for log (1 — x), find a power series 
representation for the definite integral 


[ log (1 — x) dx. 
0 


x 


: For what values of x is the resulting power series convergent. 


Section 12:3 


12-25 Derive the Maclaurin series expansion of each of the following functions 
together with its interval of convergence. 


(a) x e727, 
(b) cosh (x?/2). 
(c) (1 + e%)?; 
| 
(d) 1+ x— 2x2 
(6) log [x + V(1 + x?)]. 
12:26 Write down the first three non-zero terms in the Maclaurin series expansion 
of 
(a) arcsinh x; 
(b) x cot x. 


(Hint: Use partial fractions.] 


12.27 Prove that any polynomial $P(x)$ of degree $n$ in $x$ may be expressed as a polynomial of degree $n$ in $(x - a)$ by the relation:

$$P(x) = P(a) + (x - a)P'(a) + \cdots + \frac{(x - a)^n}{n!}P^{(n)}(a).$$

Use this result to express $P(x) = x^4 + 2x^2 + x + 1$ in powers of $(x - 1)$.


12.28 Show that $f(x) = \exp(\arccos x)$ satisfies the relation

$$(1 - x^2)f'' - xf' - f = 0.$$

Use Leibnitz's theorem to obtain the result

$$(1 - x^2)f^{(n+2)} - (2n + 1)xf^{(n+1)} - (n^2 + 1)f^{(n)} = 0.$$

Hence write down the first three terms of the Maclaurin series for $\exp(\arccos x)$.
12.29 Show that $f(x) = \sin(k \arcsin x)$ satisfies the relation

$$(1 - x^2)f^{(n+2)} - (2n + 1)xf^{(n+1)} - (n^2 - k^2)f^{(n)} = 0.$$

Hence write down the first four terms of the Maclaurin series for $\sin(k \arcsin x)$ and show that it reduces to the single term $x$ when $k = 1$.


12.30 Taking $n = 1$, use Taylor's theorem with a remainder to determine whether the following functions increase or decrease with $x$ for $x > 0$:
(a) $x - \tanh x$; (b) $\arctan x - x$; (c) $\log(1 + x) - x$.

12-31 Write down the Lagrange remainder Rs(x) for each of the following functions: 


(a) f(a + x) = sinh(a+ x); 
(b) fla + x) = sin(@@ + x). 


12-32 Write down the Taylor polynomial P(x) for each of the following functions: 


(a) log (1 + 2x); 

(b) cos (x + x); 

(c) [νὰ — x”); 

(4) α (a > 0). 
12:33 Estimate the error if e* is represented by its Taylor polynomial 
Sa, cae, 
21 31} 4 


in the interval 0 < x < ἐ. 


P(x)=1+x+ 


12.34 How many terms need to be taken in the series

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots$$

in order to determine $\log 2$ to within an error of
(a) 0.01;
(b) 0.001?


12:35 Determine the value of the integral 


0-8 oj 
sin x 
[ ΣΧ ax 
0 x 


accurate to within an error of 0001. 


12.36 Determine the value of the integral

$$\int_0^1 e^{-x^2}\, dx$$

accurate to within an error of 0.001.
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12-37 Determine ἔπ to within an error of 0:01 by evaluating the integral 


jy dx 
{ v(l — x?) 


12:38 Expand f(x, y) = χ' + 3xy? + y about the point (1, 1). 


12-39 Write down the first three terms of the Taylor series expansion of f(x, y) 
= εὖ sin y about the origin. 


12-40 Write down the first three terms of the Taylor series expansion of f(x, y) 
= e**Y about the point (1, —1). 


Section 12:4 


12:41 Evaluate the limit 
lim sin? 7x 
yoo 2ev2 —xe 


12:42 Given that f(x, y) = εἴν -- 1 -- y(e* — 1), and that 4 is neither equal to 
1 nor 0, evaluate 


_ f(x, a) 
ey 


12:43 Evaluate the limit 


lim (4: — 25) — x(log 4 — log 2) 
2—0 x? 


12:44 Evaluate the limit 
as cot x — 1/x 
zg \coth x — I/x} 
12:45 Evaluate the limit 
. sec? x — 2 tan x 
im ————. 
trolr 1+ cos 4x 
12:46 This problem is concerned with the derivation of a numerical integration 
formula using five equally spaced ordinates in which the functional value is 


specified at the first and last point, and its derivative is specified at the three 
intermediate points. To be precise, it establishes that 


2h 
Ϊ f'(ddx = = ΩΡ — ΚΟ + 2f(—A)) + Eh), (A) 
—2h 


where the error 


28:5 
Eth) << — 
A< 50 M, 
with M = max | f(x) | for —2h < x < 2h. 
Expand f’(x) in a Maclaurin series with a remainder term of the form 
x4f)(Ex)/4!, where 0 < & < 1, and show that 
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12°47 


12:48 


12:49 


12:50 
12:51 


12-52 


2h 8h3 2h χά 
f' (dx = 4hf’(0) + > fO) + [ π f©(éx)dx. (B) 


—2h 2h 


Show from the Maclaurin series with a remainder term that 


Wf(0) = (f’(h) — 260) + f(—A) -- τ (((ϑ)ιιξχ) + .ὅ.0).)), (C) 


where the term f(yx) arises from the remainder term in the expansion of 
[(--αὐ, andO0< 7 <1. 

Deduce result (A) from (B) and (C), and show that it may also be written 
in the form 


μὴ = κοῦ + Ter) -- [Ὁ τ + BH. 5Θ ὦ 


Using the method of the above problem show that 
h h 
[Foose = 300 + 47 + /(-W) + BO 
—h 


where E(h) < (h°/90)M, with M = max | f(x) | for —h < x <h. Deduce 
that it may also be written in the form 


ὈΞῪ sf) + 4f°0) + f'(—A)) + Eth). 


This result is, of course, Simpson’s rule applied to the derivative f’(x) and 
could have been deduced directly from the result of Section 12:4 (c); con- 
versely, replacing f’(x) by f(x) and f(x) by f(x), this provides an alterna- 
tive derivation of the error term in Simpson’s rule. 


Using the method of Problem 12:46, derive the trapezoidal rule, together 
with its error estimate. Namely, show that 


h 
| foddx = 5 (fO) + fd) + EG), 
«Ὁ 


where 


h i 


with M = max | fo) forO <x <h. 


Use Newton's method to calculate $\sqrt{21}$ accurately to four decimal places by seeking the zero of the function $f(x) = 21 - x^2$. Start your iteration with $x_0 = 4$.


Calculate 1523 to four decimal places using Newton’s method. 


There are two real roots of x4 + x2 — 2x — 3 = 0. Between what pairs of 
integers do they lie. Find the positive real root to three decimal places by 
Newton’s method. , 


Locate the pair of integers between which lies the one real root of the 
equation 


x3—x—1=0. 


Determine the value of this root to four decimal places. 


12:53 
12:54 


12°55 


12:56 


12:57 


12:58 


12:59 


12-60 


12-61 


12°62 


12:63 


12:64 
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Find the positive root of sin x — $x = 0 to three decimal places. 
Locate and identify the nature of the stationary points of the function 
fy) = x3 4 3xy? — 15x — 12y. 
Find the functional value of any maxima or minima. 
Locate and identify the stationary points of the function 
Sx, y) = 2x4 — 3x2y? + y4 + Bx? + 3y?. 
Locate and identify the stationary point of the function 
f(x, y) = x8y?6 — x — y) 
which lies in the first quadrant x > 0, y > 0. 
Locate and identify the stationary points of the function 
I y) = xP + γ8 — Ixy. 


By considering the proof of Theorem 12.14 show that the conditions stated in (c) are equivalent to, and may be replaced by,

(c') $f_{yy}(x_0, y_0) < 0$ for a local maximum, or $f_{yy}(x_0, y_0) > 0$ for a local minimum.


If $2s$ denotes the perimeter of a triangle with sides of length $a$, $b$, $c$, then its area $A$ is given by the formula

$$A = \sqrt{s(s - a)(s - b)(s - c)}.$$

Show that for a given perimeter, the equilateral triangle has maximum area.
[Hint: Consider the function $F(a, b) = s(s - a)(s - b)(a + b - s)$.]


Locate and identify the extrema of f(x, y) = 6 — 4x — 3y subject to the 
constraint x? + y? = 1. Interpret the problem and your result in geometrical 
terms. 


Locate and identify the stationary point of f(x, y) = x? + y? subject to the 
constraint (x/2) + (y/3) = 1. 


Locate and identify the stationary points of f(x, y) = cos? x + cos? y 
subject to the constraint y — x = $7. 


Use the method of least squares to fit the straight line $Y = a + bx$ to the four points (0, 0.1), (1, 1.1), (2, 1.6), (3, 3.3).


Use the method of least squares to fit the quadratic $Y = a + bx + cx^2$ to the six points (0, 0.7), (1, 3.4), (1.5, 5.0), (2.0, 8.1), (2.4, 11.5), (3.0, 21.0).


Differential equations 
and geometry 


13.1 Introductory ideas


Special examples of differential equations have already been encountered; 
for example, those that gave rise to the exponential function and to the sine 
and cosine functions. It is now appropriate to make a systematic study of 
certain differential equations that are both useful and of frequent occurrence. 
We shall begin by examining a number of simple examples to illustrate the 
basic ideas. 

Any equation involving one or more derivatives of a differentiable func- 
tion of a single independent variable is called an ordinary differential equa- 
tion. The following related equations taken from elementary dynamics are 
familiar examples. 


$$\frac{d^2x}{dt^2} = g \qquad \text{(acceleration equation)}$$

$$\frac{dx}{dt} = u + gt \qquad \text{(velocity equation)}.$$

They describe, respectively, the acceleration and velocity of a particle falling freely under the action of gravity. Here $g$ is the acceleration due to gravity, $x$ is the distance of the particle from a fixed origin in its line of motion at time $t$, and $u$ is the initial velocity of the particle. In these simple equations the dependent variable is represented by the displacement $x$ and the independent variable by the time $t$. The integration of these equations is elementary and already familiar to the reader, who will also recognize that the velocity equation is in fact the integral of the acceleration equation with the arbitrary constant of integration set equal to the initial velocity $u$, since the velocity equation must describe the velocity at the start of the motion (when $t = 0$).

The first step in a systematic study of useful ordinary differential equa- 
tions, aimed at producing general methods of solution wherever possible, is 
a straightforward classification of the equations. This we achieve by associ- 
ating two numbers with each equation which we shall refer to as its order 
and its degree. We define the order of an ordinary differential equation to be 
the order of the highest derivative appearing in the equation, and the degree 
to be the exponent to which this highest derivative is raised when fractions 
and radicals involving y or its derivatives have been removed from the equation. Clearly, the notion of degree is only applicable when a differential
equation has a simple algebraic structure allowing such a classification to be 
made. Thus both the simple dynamical equations just described are of degree 
1, but the acceleration equation is of second order whereas the velocity 
equation is of first order. These are, in fact, examples of a specially important 
class of equations known as linear differential equations. 

All linear differential equations are characterized by the fact that the dependent variable and its derivatives only occur with degree 1, whilst the coefficients multiplying them are either constants or functions of the independent variable. Thus of the following three second order differential equations, only the first two are linear, since the last involves the non-linear product $y(dy/dx)$. In general, differential equations that are not linear are termed non-linear.

$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = 0,$$

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - n^2)y = 0,$$
déy dy 


The classification of a more complicated differential equation is illustrated by the following example, involving both fractions and radicals, in which $k$ is a constant:

$$\frac{(y'')^{3/2}}{y + (y'')^2} = k.$$

Clearing the fractions and radicals gives rise to the ordinary differential equation

$$k^2 y''^4 - y''^3 + 2k^2 y\,y''^2 + k^2 y^2 = 0,$$

showing that the order is 2 and the degree is 4.

If $y', y'', \ldots, y^{(n)}$ respectively denote successive derivatives, up to order $n$, of a differentiable function $y(x)$ with independent variable $x$, then a general $n$th order ordinary differential equation has the form

$$F(x, y, y', \ldots, y^{(n)}) = 0, \tag{13.1}$$

where $F$ is an arbitrary function of the variables involved.


DEFINITION 13.1 (solution of differential equation) A solution of the ordinary differential Eqn (13.1) is a function $y = \varphi(x)$ that is differentiable a suitable number of times in some interval $I$ containing the independent variable $x$, and which has the property that

$$F(x, \varphi(x), \varphi'(x), \ldots, \varphi^{(n)}(x)) = 0 \tag{13.2}$$

for all $x$ belonging to $I$.

Notice that it is important to define the interval $I$ since the differential equation does not necessarily describe the solution for unrestricted values of the argument $x$.

Thus a solution of the velocity equation just used as an example would be a differentiable function $x = \varphi(t)$ defined for some interval $I$ of time $t$ with the property that

$$\varphi'(t) - gt - u = 0, \tag{13.3}$$

for all $t$ in the interval $I$. In this case $I$ would be of finite size since the particle could not fall for an unlimited time without being arrested by contact with the ground, after which the ordinary differential equation giving rise to solution Eqn (13.3) would no longer be valid.
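The defining property of a solution can be checked numerically for the velocity equation. The sketch below assumes the candidate solution $\varphi(t) = ut + \tfrac{1}{2}gt^2$ (the integral of $u + gt$ with the origin chosen so that $\varphi(0) = 0$) and illustrative values for $g$ and $u$; the derivative is approximated by a central difference, so the residual is only expected to vanish to within rounding error.

```python
G = 9.81  # acceleration due to gravity in m/s^2 (illustrative value)
U = 2.0   # initial velocity in m/s (illustrative value)

def phi(t):
    # Candidate solution x = phi(t) of the velocity equation dx/dt = u + g t.
    return U * t + 0.5 * G * t * t

def residual(t, dt=1e-5):
    # Left-hand side phi'(t) - g t - u, with phi' from a central difference.
    dphi = (phi(t + dt) - phi(t - dt)) / (2 * dt)
    return dphi - G * t - U

# Sample a few times inside an interval on which the particle is still falling.
residuals = [residual(t) for t in (0.0, 0.5, 1.0, 2.0)]
print(residuals)
```

A function that is not a solution, such as $\varphi(t) = ut$, would leave a residual of $-gt$ at each sampled time.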

The prefix ordinary is used to describe differential equations involving 
only one dependent and one independent variable, in contrast with partial 
differential equations, which involve partial derivatives, and so have at least 
two independent variables and may also contain more than one dependent 
variable. Normally, when the type of differential equation being discussed is 
clear from the context, the adjectives ‘ordinary’ and ‘partial’ are omitted. 

It is possible to develop the theory of differential equations in considerable 
generality, but our approach, as mentioned before, will be to examine a 
number of useful special forms of equation. We shall, however, first examine 
a few of the ways in which important forms of ordinary differential equation 
may arise. 


13.2 Possible physical origin of some equations


At this stage it will be useful to illustrate some typical forms of differential 
equation, showing their manner of derivation from physical situations. We 
shall consider a number of essentially different physical problems and in each 
case take the discussion as far as the derivation of the governing differential 
equation. 


Example 13-1 Experiment has shown that certain objects falling freely in 
air from a great height experience an air resistance that is proportional to the 
square of the velocity of the body. Let us determine the differential equation 
that describes this motion, and for convenience take our origin for the time t 
at the start of the motion. We shall assume that the body has a constant mass 
m and that at time t the velocity of fall is v, so that the air resistance at time t 
becomes $\lambda v^2$ units of force, where $\lambda$ is a constant of proportionality. 

Now by definition, the acceleration a is the rate of change of velocity, so 
that a = dv/dt and, since the body has constant mass m, it immediately 
follows from Newton’s second law that the force accelerating the body is 
m(dv/dt). To obtain the equation of motion this force must now be equated 
to the other forces acting vertically downwards which are, taking account of 
the sign, the weight mg and the resistance $-\lambda v^2$. The equation of motion is 
thus 

$$m\frac{dv}{dt} = mg - \lambda v^2$$

or, dividing throughout by the constant m, 

$$\frac{dv}{dt} = g - \frac{\lambda}{m}v^2,$$


which is a special case of a differential equation in which the variables are 
separable. A general differential equation of this form involving the inde- 
pendent variable x and the dependent variable y can be written in either of 
the two general forms 


$$\frac{dy}{dx} = M(x)N(y) \tag{13-4}$$

or 

P(x)Q(y)dx + R(x)S(y)dy = 0. (13-5) 


Example 13-2 In many simple chemical reactions the conversion of a raw 
material to the desired product proceeds under constant conditions of temperature 
and pressure at a rate directly proportional to the mass of raw material 
remaining at any time. If the initial mass of the raw material is Q, and the 
mass of the product chemical at time t is q, then the unconverted mass remaining 
at time t is Q - q. Then, if -k (k > 0) denotes the proportionality 
factor governing the rate of the reaction, the reaction conversion rate 
d(Q - q)/dt must be equal to -k times the unconverted mass Q - q. The 
desired reaction rate equation thus has the form 

$$\frac{d}{dt}(Q - q) = -k(Q - q),$$

where the minus sign has been introduced into the definition of k to allow 
for the fact that Q - q decreases as t increases. 


Example 13-3 A simple closed electrical circuit contains an inductance L 
and a resistance R in series, and a current i is caused to flow by the application 
of a voltage $V_0 \sin \omega t$ across two terminals located between the resistance 
and inductance. The equation governing this current i may be obtained by a 
simple application of Kirchhoff’s second law, which tells us that the algebraic 
sum of the drops in potential around the circuit must be zero. Thus, since 
the driving potential is $V_0 \sin \omega t$ and the changes in potential across the 
inductance and resistance are in the opposite sense to i and so are, respectively, 
-L(di/dt) and -Ri, it follows that 

$$V_0 \sin \omega t - L\frac{di}{dt} - Ri = 0$$

or, 

$$\frac{di}{dt} + \frac{R}{L}i = \frac{V_0}{L}\sin \omega t.$$

The final equations of Examples 13-2 and 13-3 are both specially simple 
cases of linear first order differential equations. If the dependent variable is 
denoted by y and the independent variable by x, then all linear first order 
differential equations have the general form 

$$\frac{dy}{dx} + P(x)y = Q(x). \tag{13-6}$$


Example 13-4 Mechanical vibrations occur frequently in physics and 
engineering and they are usually controlled by the introduction of some suitable 
dissipative force. A typical situation might involve a mass m on which 
acts a restoring force proportional to the displacement x of the mass from 
an equilibrium position, and a resistance to motion that is proportional to 
the velocity of the mass. Such a system, which to a first approximation could 
represent a vehicle suspension involving a spring and damper, is often 
tested by subjecting it to a periodic external force $F \cos \omega t$ in order to simulate 
varying road conditions. In this situation the displacement x would represent 
the movement of the centre of gravity of the vehicle about an equilibrium 
position as a result of passage of the vehicle along a road with a sinusoidal 
profile. If the resisting force $F_d$ has a proportionality constant k, and the 
restoring force $F_r$ has a proportionality constant $\lambda$, then $F_d = k(dx/dt)$ and 
$F_r = \lambda x$. Applying Newton’s second law, as in Example 13-1, and equating 
forces acting on the system we obtain the equation of motion 

$$m\frac{d^2x}{dt^2} = F \cos \omega t - k\frac{dx}{dt} - \lambda x$$

or, 

$$\frac{d^2x}{dt^2} + \frac{k}{m}\frac{dx}{dt} + \frac{\lambda}{m}x = \frac{F}{m}\cos \omega t.$$


This is a particular case of a linear constant coefficient second order 
differential equation, all of which have the general form 


$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = f(x), \tag{13-7}$$

where x is the independent variable, y the dependent variable, and a and b 
are constants. Equations (13-6) and (13-7) are said to be inhomogeneous when 
they contain a term involving only the independent variable; otherwise they 
are said to be homogeneous. The differential equation of Example 13-2 is 
thus homogeneous of order 1 with dependent variable (Q - q), whilst that 
of Example 13-4 is inhomogeneous of order 2; both are linear and involve 
constant coefficients. If in Example 13-2 the temperature of the reaction were 
allowed to vary with time, then in general the velocity constant k of the 
reaction would become a function of the time t and the equation would assume 
the homogeneous form of Eqn (13-6) with a variable coefficient. 

The special importance of the types of differential equation singled out 
here lies in their frequent occurrence throughout the physical sciences. We 
shall later proceed with a systematic study of solution methods for these 
standard forms, together with other common cases of interest. 


13-3 Arbitrary constants and initial conditions 


If we consider the simple differential equation 


$$\frac{d^2x}{dt^2} = g, \tag{13-8}$$

then a single integration with respect to time gives dx/dt = gt as a possible 
first integral. This is certainly a solution of Eqn (13-8) in the sense defined in 
Eqn (13-2), but it is not the most general solution since 

$$\frac{dx}{dt} = c_1 + gt, \tag{13-9}$$

where $c_1$ is an arbitrary constant, is also a solution. This specific example 
illustrates the general result that in order to obtain the most complete form 
of solution, each integral involved in the solution of a differential equation 
must be interpreted as an antiderivative or, more loosely, as an indefinite 
integral. When maximum generality is sought the result is termed the general 
or complete solution of the differential equation. It is, therefore, important 
that when obtaining the general solution of a differential equation, an arbi- 
trary constant should be introduced immediately after each integration. 
Thus the general solution of Eqn (13-8), which is obtained after two 
integrations, is 


$$x = c_2 + c_1 t + \tfrac{1}{2}gt^2, \tag{13-10}$$


where $c_2$ is another arbitrary constant. 

These arbitrary constants may be given definite values, and a particular 
solution obtained, if the solution is required to satisfy a set of conditions, at 
some starting time $t = t_0$, equal in number to the order of the differential 
equation. If, for example, Eqn (13-8) describes the acceleration of a body 
falling under the influence of gravity, and air resistance may be neglected, 
then Eqn (13-10) is the general solution of the problem of the position of the 
body at time t. In the event that the body started to fall with an initial velocity 
u at time t = 0, it follows from Eqn (13-9) that $c_1 = u$. Similarly, if the body 
was at position $x = x_0$ at time t = 0, it follows from Eqn (13-10) that 
$c_2 = x_0$, and so the particular solution corresponding to the initial conditions 
$x = x_0$, dx/dt = u at t = 0 is 


$$x = x_0 + ut + \tfrac{1}{2}gt^2. \tag{13-11}$$


General starting conditions of this type are known as initial conditions by 
analogy with time dependent problems such as this in which the solution 
evolves away from some known initial state. On occasion it is convenient to 
write initial conditions in an abbreviated form which we illustrate by re- 
peating the initial conditions that gave rise to solution Eqn (13-11): 


$$x\Big|_{t=0} = x_0, \qquad \frac{dx}{dt}\Big|_{t=0} = u.$$


Something more of the role of arbitrary constants may be appreciated if 
they are eliminated by differentiation from a general expression describing a 
family of curves in order that the differential equation describing the family 
may be obtained. Suppose, for example, that a general two-parameter family 
of curves is defined by the expression 


y(x) = A cosh 2x + B sinh 2x, 


where A and B are arbitrary constants (which we now regard as parameters). 
Then differentiation shows that 


$$y' = 2(A \sinh 2x + B \cosh 2x) \quad \text{and} \quad y'' = 4(A \cosh 2x + B \sinh 2x),$$


from which it follows that elimination of A and B gives the differential 
equation 


$$y'' = 4y.$$

This is the differential equation that has the two-parameter family of curves 
y(x) as its general solution. 
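
The elimination just described can be checked numerically. The following Python sketch is illustrative only: the helper names, the step length h and the sample values of A and B are assumptions made here, not part of the text. It estimates the second derivative of a member of the family by a central difference and compares it with four times the function value.

```python
import math

def family(x, A, B):
    """Member of the two-parameter family y = A cosh 2x + B sinh 2x."""
    return A * math.cosh(2 * x) + B * math.sinh(2 * x)

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x); h is an assumed step length."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def residual(x, A, B):
    """Estimate of y'' - 4y, which should be close to zero for every A and B."""
    curve = lambda s: family(s, A, B)
    return second_derivative(curve, x) - 4.0 * family(x, A, B)

# Arbitrary sample parameters: the residual stays small whatever A and B are.
for A, B in [(1.0, 0.0), (0.0, 1.0), (2.5, -1.5)]:
    worst = max(abs(residual(0.1 * k, A, B)) for k in range(-10, 11))
    print(f"A = {A}, B = {B}: max |y'' - 4y| is about {worst:.1e}")
```

The residual is not exactly zero because the central difference is itself only an approximation; it shrinks further as h is reduced, until rounding error takes over.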

We should now see whether, having found a particular solution of a 
differential equation with given initial conditions, this is indeed the only 
possible solution. This is called the uniqueness problem for the solution and is 
obviously important in physical applications. To answer uniqueness questions 
for general classes of differential equations is difficult, but in the case of the 
dynamical problem just discussed a simple argument will suffice. 

Let v = dx/dt denote the velocity of the body so that Eqn (13-8) takes the 
form dv/dt = g. Now suppose that some other function w is also a solution 
of Eqn (13-8), satisfying the same initial conditions, so that dw/dt = g and 
v = w = u at t = 0. Then, setting V = v - w, it is easily established by 
subtraction of the two linear differential equations that the differential 
equation satisfied by the difference between the two postulated solutions is 
dV/dt = 0, thereby showing that V = constant. However, as the initial 
conditions require that V = 0, it follows at once that w = v. The velocity is 
thus uniquely determined by the differential equation and the initial condition. 
To complete the proof that the position is also uniquely determined it is only 
necessary to apply the foregoing argument to the velocity equation dx/dt = 
u + gt obtained by direct integration of dv/dt = g. This matter of uniqueness 
will be taken up again in the next section in connection with some useful 
geometrical ideas. 

It is not always necessary, or indeed possible, to prescribe only initial 
data for a differential equation, as we now illustrate by reformulating the 
previous example. 

We have seen that the velocity v of the body is uniquely determined by 
Eqn (13-9) once it has been specified at some given instant of time. Similarly, 
when the velocity is known, the position is uniquely determined by Eqn 
(13-10) once it has been specified at some given instant of time. Velocity and 
position were specified at the same instant of time in the initial value problem 
just discussed; we now illustrate an alternative problem that could equally 
well have been considered. 

As the solution (13-10) implies the result (13-9), it would be quite permissible 
to determine a particular solution by requiring the body to be at the 
positions $x_0$ and $x_1$ at the respective times $t_0$ and $t_1$. These conditions would 
enable the determination of the arbitrary constants $c_1$ and $c_2$ and would, of 
course, completely determine the velocity. Conditions such as these that are 
imposed on the solution of a differential equation at two different values of 
the independent variable are called two-point boundary conditions. This name 
is derived from the fact that in many important applications the conditions 
to be imposed are prescribed at two physical boundaries associated with the 
problem. 
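
To make the two-point problem concrete, the following Python sketch solves the pair of linear equations obtained by substituting the two prescribed positions into the general solution x = c2 + c1 t + (1/2) g t^2. The function names, the default value g = 9.81 and the sample data are assumptions chosen purely for illustration.

```python
def two_point_constants(t0, x0, t1, x1, g=9.81):
    """Constants c1, c2 in x = c2 + c1*t + 0.5*g*t**2 fitted to x(t0) = x0, x(t1) = x1."""
    if t1 == t0:
        raise ValueError("the two times must be different")
    # Subtracting the two conditions eliminates c2 and leaves an equation for c1.
    c1 = ((x1 - x0) - 0.5 * g * (t1 ** 2 - t0 ** 2)) / (t1 - t0)
    # Either condition then gives c2.
    c2 = x0 - c1 * t0 - 0.5 * g * t0 ** 2
    return c1, c2

def position(t, c1, c2, g=9.81):
    """Position given by the general solution."""
    return c2 + c1 * t + 0.5 * g * t ** 2

def velocity(t, c1, g=9.81):
    """Velocity dx/dt = c1 + g*t, fixed once c1 is known."""
    return c1 + g * t

# Hypothetical data: the body is at x = 0 when t = 0 and at x = 30 when t = 2.
c1, c2 = two_point_constants(0.0, 0.0, 2.0, 30.0)
print("c1 =", c1, " c2 =", c2)
print("x(2) =", position(2.0, c1, c2), " v(0) =", velocity(0.0, c1))
```

The calculation fails only when the two times coincide, in which case the two conditions either repeat or contradict one another.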

In the simple initial value problem discussed here the question of the 
existence of a solution was never in doubt since we were able to find the 
general solution by direct integration. This is not usually the case; with more 
complicated differential equations the first question to be asked is ‘does a 
solution exist’ and, if so, ‘is it unique’. 

To illustrate this let us again consider Eqn (13-8), but this time with 
different two-point boundary conditions. At first sight it might appear 
reasonable to specify the velocity rather than the position at two different 
times, but a moment’s reflection shows that this is not possible. This arises 
because Eqn (13-9) determines the velocity, and unless the two pre-assigned 
velocities were in agreement with this equation there could obviously be no 
solution satisfying such two-point boundary conditions. Furthermore, even 
if they were in agreement with Eqn (13-9), only the arbitrary constant $c_1$ 
would be so determined, leaving an infinity of solutions Eqn (13-10) of the 
original differential equation corresponding to arbitrary values of $c_2$. 
Having made this point we shall not pursue it further in this first course 
on differential equations. 


13-4 Properties of solutions—isoclines 


Before discussing methods of solution of differential equations the general 
notion of a solution already outlined must be more fully discussed. Then, 
after examining the differentiability of solutions in this context, we shall, as a 
preliminary to developing special methods of solution, illustrate how the 
properties of particular differential equations may be explored by the applica- 
tion of some useful geometrical ideas. 

The concept of a solution given in Definition 13-1 implies that for a 
differential equation of order n, the solution must be differentiable at least 
n times within the interval I of its definition. This is easily seen by writing the 
equation of order n displayed implicitly in Eqn (13-1) in the explicit form 

$$y^{(n)} = f(x, y, y', \ldots, y^{(n-1)}), \tag{13-12}$$

and then using the fact that it is necessary for y to be differentiable n - 1 
times for the arguments $y', y'', \ldots, y^{(n-1)}$ of the function f to exist. 


To illustrate these ideas we shall consider the solution of the differential 
equation 

$$y'' + y = 0$$

within the interval I determined by -1 < x < 1. It is easily verified by 
differentiation that 

$$y_1(x) = \cos x + \sin x \quad \text{and} \quad y_2(x) = \cos x + 2\sin x$$

are both solutions of the differential equation over the interval I. However, 
the function y(x) defined on I by 

$$y(x) = \begin{cases} \cos x + \sin x & (-1 < x < 0) \\ \cos x + 2\sin x & (0 \le x < 1) \end{cases}$$

is not a solution of the differential equation, because although y(x) is continuous 
over I, dy/dx has a discontinuity at x = 0 and, since the equation is 
of second order, we must require of its solution that at least y and y' be continuous 
over I. This example introduces into the context of differential equations 
the idea of left- and right-handed derivatives already encountered in 
connection with continuity and differentiability. Here y(x) has the property 
that y'(0-) = 1 and y'(0+) = 2. 
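
The jump in the derivative at x = 0 can be seen numerically by forming one-sided difference quotients. The Python sketch below is only an illustration: the function names and the step length h are assumptions, and the value at x = 0 itself is assigned to the right-hand branch, which is harmless because both branches equal 1 there.

```python
import math

def y(x):
    """Piecewise function built from cos x + sin x (x < 0) and cos x + 2 sin x (x >= 0)."""
    if x < 0:
        return math.cos(x) + math.sin(x)
    return math.cos(x) + 2.0 * math.sin(x)

def left_derivative(f, x, h=1e-6):
    """Backward difference quotient, approximating the left-handed derivative."""
    return (f(x) - f(x - h)) / h

def right_derivative(f, x, h=1e-6):
    """Forward difference quotient, approximating the right-handed derivative."""
    return (f(x + h) - f(x)) / h

print("y(0)   =", y(0.0))
print("y'(0-) is approximately", left_derivative(y, 0.0))
print("y'(0+) is approximately", right_derivative(y, 0.0))
```

The two quotients settle near 1 and 2 respectively, so y is continuous at the origin while its derivative is not.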

A solution of a differential equation must also be finite in its interval of 
definition and sometimes it is necessary to restrict the interval to achieve this. 
The simple differential equation 

$$y' = 1 + y^2$$

clearly has as its solution the function y = tan x whenever $x \ne (2n + 1)\tfrac{1}{2}\pi$. 
Thus y = tan x is a solution in any open interval contained in $I_n = [(2n - 1)\tfrac{1}{2}\pi 
< x < (2n + 1)\tfrac{1}{2}\pi]$. In particular, a solution exists in the interval $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$ 
but not in the interval $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$. In examples of this type, the behaviour of 
the solution in the neighbourhood of the end points of the intervals of definition 
means that the solutions in adjacent intervals cannot be connected across 
the points of discontinuity. 

These ideas are greatly clarified if the notion of an isocline is introduced, 
which we now do in relation to the first order differential equation y' = 
f(x, y). From the ideas already formulated about the nature of a solution on 
a specified interval I, it is clear that when initial conditions $x = x_0$, $y = y_0$ 
are specified, the differential equation then assigns to the derivative of the 
solution at $x = x_0$ the value $(y')_0 = f(x_0, y_0)$. Alternatively, representing 
the solution by a curve in the (x, y)-plane, we may use geometrical terminology 
and speak instead of the initial point $(x_0, y_0)$, and of the derivative of the 
solution $(y')_0 = f(x_0, y_0)$ at that point. From the geometrical interpretation 
of a derivative this implies that the tangent to the solution curve through the 
point $(x_0, y_0)$ is inclined to the x-axis at an angle $\theta = \arctan f(x_0, y_0)$. For 
any particular solution $y_p$ through an initial point $(x_0, y_0)$, the derivative 
$y_p'(x)$ will, in general, have different values associated with different values of 
the independent variable x, and so far we have no way of determining how $y_p$ 
varies with x. However, if instead of a particular solution the general properties 
of all solutions of the differential equation y' = f(x, y) are to be 
considered, then the equation may be regarded as assigning to each point 
of the (x, y)-plane a specific value of y', and hence also a specific angle of 
inclination $\theta$ for the tangent to the solution curve through that point. If the 
derivative y' is set equal to a constant value K, then the equation K = f(x, y) 
determines those curves in the (x, y)-plane along which the tangents to the 
solutions all have a constant angle of inclination $\theta_0 = \arctan K$ to the x-axis. 
Because of the equal angle of inclination of the tangents to the solution 
curves that pass through points on these lines, the lines themselves are termed 
isoclines. Different values of K determine different isoclines, and it is quite 
possible that no isoclines exist for some values of K. 

It is, of course, important to recognize that an isocline is only a curve 
characterizing a special property of all the solutions of the differential 
equation and that it is not itself a curve representing a solution. 

In principle, the simple idea of an isocline provides an approximate procedure 
for the determination of a numerical solution of a general first order 
differential equation. If we know that the solution passes through the initial 
point $(x_0, y_0)$, and it is required to determine the solution as far as x = d, 
then we start by sub-dividing the interval $[x_0, d]$ into n sub-intervals. Letting 
$x_i = x_0 + i\Delta x$, where i = 1, 2, ..., n and $\Delta x = (d - x_0)/n$, we determine 
$y_1$, the approximate solution at $x = x_1$, by drawing through the point 
$P_0(x_0, y_0)$ a line with gradient $f(x_0, y_0)$, and setting $y_1$ equal to the value of 
y at point $P_1$ where the tangent line intersects the ordinate through $x = x_1$ 
(Fig. 13-1). This process is repeated until x becomes equal to $x_n = d$, when the 
numbers $y_1, y_2, \ldots, y_n$ represent approximations to the true solution y at 
the points $x_1, x_2, \ldots, x_n$. Naturally, the accuracy of these numbers depends, 
in part, on the number n of sub-divisions that has been chosen. 

Fig. 13-1 Graphical solution of $y' = f(x, y)$, $y(x_0) = y_0$. 

The polygonal line obtained by joining adjacent points $(x_i, y_i)$ by straight 
line elements is called the Cauchy polygon approximation to the solution, 
after A. L. Cauchy who was the first to establish its convergence to the true 
solution as $\Delta x \to 0$. 

This process may be mechanized by the use of differentials together with 
a suitable notation. To see this let us introduce the differentials dx and $dy_i$ 
through the equation 

$$dy_i = f(x_i, y_i)\,dx,$$

with $dx = \Delta x$, so that at $(x_i, y_i)$ the ratio $dy_i/dx$ of the finite quantities dx 
and $dy_i$ is equal to the value of $y'(x_i)$. This leads to the general result 

$$y_{i+1} = y_i + dy_i$$

or, equivalently, 

$$y_{i+1} = y_i + f(x_i, y_i)\,dx.$$

As a particular case of this we have 

$$y_1 = y_0 + f(x_0, y_0)\,dx,$$

which is just the first step of our earlier argument alternatively expressed. 
Other arguments show that the magnitude of the error arising at each step 
is of the order of $(dx)^2$. 

The following example applies this result to the numerical integration of a 
simple differential equation over five equal steps $\Delta x = 0.1$ though, if desired, 
the interval $\Delta x$ may be varied from step to step. 


Example 13-5 Let us find the approximate value of y at x = 0.5, given 
that y' = xy and y(0) = 1. 


Solution If we take five equal sub-intervals, so that $\Delta x = 0.1$, the results 
of the calculations based on the relation $dy_i = 0.1\,x_i y_i$ will be as follows: 


i    x_i    y_i       dy_i      e^(x_i^2/2) 
0    0      1         0         1 
1    0.1    1         0.01      1.0050 
2    0.2    1.01      0.0202    1.0202 
3    0.3    1.0302    0.0309    1.0460 
4    0.4    1.0611    0.0424    1.0833 


A comparison of the third column with the final column, which tabulates the 
exact solution $y = e^{x^2/2}$, demonstrates the relatively poor accuracy obtainable 
by this simple approach, known as Euler’s method for the numerical integration 
of a differential equation. The approximate value y(0.5) = 1.1035 
obtained by Euler’s method is seen to be already 2.6 per cent low, and attempts 
to determine y for values of x > 0.5 would result in a very rapid growth of 
error. The Cauchy polygon is compared with the exact solution in Fig. 13-2. 
Later we shall show how a simple modification to this method will produce 
a considerable improvement. 
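
The arithmetic of Example 13-5 is easily reproduced by machine. The following Python sketch is a minimal illustration of the step y_{i+1} = y_i + f(x_i, y_i) dx applied to y' = xy, y(0) = 1; the function names are assumptions made here, and the exact solution exp(x^2/2) is used only for comparison.

```python
import math

def euler_steps(f, x0, y0, dx, n):
    """Rows (i, x_i, y_i, dy_i) for i = 0..n, using y_{i+1} = y_i + f(x_i, y_i) * dx."""
    rows = []
    y = y0
    for i in range(n + 1):
        x = x0 + i * dx
        dy = f(x, y) * dx
        rows.append((i, x, y, dy))
        y += dy
    return rows

def slope(x, y):
    """Right-hand side of the equation y' = xy of Example 13-5."""
    return x * y

def exact(x):
    """Exact solution y = exp(x^2 / 2) satisfying y(0) = 1."""
    return math.exp(0.5 * x * x)

rows = euler_steps(slope, 0.0, 1.0, 0.1, 5)
for i, x, y, dy in rows:
    print(f"{i}  {x:.1f}  {y:.4f}  {dy:.4f}  {exact(x):.4f}")

x_end, y_end = rows[-1][1], rows[-1][2]
relative_error = (exact(x_end) - y_end) / exact(x_end)
print(f"Euler value at x = {x_end:.1f} is low by {100 * relative_error:.1f} per cent")
```

Reducing dx and increasing the number of steps in proportion shows how slowly the accuracy of this simple method improves.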

Returning to the subject of isoclines we shall now utilize several examples 
to illustrate some typical situations. As a solution curve arises as an integral 
of the original differential equation, it is customary to refer to the solution 
curves as integral curves. 
Fig. 13-2 Comparison of exact solution and Cauchy polygon. 


Example 13-6 Consider the simple differential equation y' = x + 1, 
which is easily seen to have the general solution $y = \tfrac{1}{2}x^2 + x + C$. Setting 
y' = K then shows that the isoclines of this differential equation are the lines 
x = K - 1. Representative isoclines are illustrated in Fig. 13-3 as the full 
vertical lines. Short inclined lines have been added to these isoclines to indicate 
the direction of the tangents to the integral curves that intersect the 
isoclines; their angles of inclination have the magnitude arctan K. Three 
integral curves, represented by curved full lines, have been drawn to show the 
relationship between isoclines, the tangents or gradients associated with 
isoclines, and the integral curves themselves. The pattern of these tangents 
associated with the isoclines shows the direction taken by integral curves and 
is accordingly termed the direction field associated with the integral curves. 

Fig. 13-3 Isoclines, direction field, and integral curves. 

Figure 13-3 also serves to illustrate the geometrical analogue of Euler’s 
method; namely, to use a map of the isoclines, each marked with their 
associated tangents indicating the direction field of the integral curves, in 
order to trace a solution that starts from a given point and always intersects 
each isocline at an angle equal to the gradient associated with it. 
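
The relationship between isoclines and the direction field for the equation y' = x + 1 can be tabulated directly. The Python sketch below is illustrative only; the function names and the sample values of K are assumptions. It lists, for each K, the vertical isocline x = K - 1 and the inclination arctan K of the tangents marked on it, and confirms that an integral curve y = x^2/2 + x + C crosses that isocline with gradient K.

```python
import math

def isocline_x(K):
    """Abscissa of the vertical isocline on which y' = x + 1 takes the value K."""
    return K - 1.0

def inclination_degrees(K):
    """Angle of inclination, in degrees, of the tangents marked on the isocline for K."""
    return math.degrees(math.atan(K))

def integral_curve_slope(x):
    """Gradient of any integral curve y = x**2/2 + x + C; it does not depend on C."""
    return x + 1.0

for K in (-2.0, -1.0, 0.0, 1.0, 2.0):
    x = isocline_x(K)
    print(f"K = {K:5.1f}   isocline x = {x:5.1f}   "
          f"inclination = {inclination_degrees(K):7.2f} deg   "
          f"curve gradient there = {integral_curve_slope(x):5.1f}")
```

Because the gradient depends on x alone, every integral curve is a vertical translation of any other, which is why the isoclines are vertical lines.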

It is easily seen that with the simple equation y’ = x + 1 there are no 
points in the finite (x, y)-plane at which the gradient is either infinite or 
ambiguous. The next two examples show more complicated situations 
involving characteristic behaviour of direction fields and integral curves at 
special points. 


Example 13-7 In the case of the differential equation y' = (1 - y)/(1 + x), 
the general solution determining the integral curves is y = 1 + C/(1 + x). 
As always, the isoclines are determined by setting y' = K in the differential 

important form of behaviour, and any family of integral curves having this 
property are said to have a saddle point at P. 


Example 13-8 A direction field of a different kind is provided by the differential 
equation y' = 2y/x, which has the lines $y = \tfrac{1}{2}Kx$ as its isoclines and 
the curves $y = Cx^2$ as its integral curves. Their inter-relationship is illustrated 
in Fig. 13-5, which also shows quite clearly that the singular point at the origin 
is of an essentially different kind to that of the previous example. Again the 
isoclines all pass through this point but, whereas in Example 13-7 there was 
only one degenerate integral curve through the point P, in the present case 
every integral curve passes through the singular point. The parabola-like 
behaviour of the integral curves in the vicinity of the origin is characteristic 
of a different form of singularity, and integral curves with this general property 
are said to have a node at the common point. 

The last two examples also serve to illustrate that initial conditions to 
differential equations may not always be prescribed arbitrarily without 
reference to the equation in question, since there may either be no solution 
or an infinity of solutions satisfying a differential equation and arbitrarily 
prescribed initial conditions. For example, no integral curve passes through 
the point (-1, 2) in Fig. 13-4, whereas every integral curve passes through the 
point (0, 0) in Fig. 13-5. Since, in the first case, solutions have infinities along 
the line x = -1, and in the second case, the direction field is indeterminate 
at (0, 0), this suggests that for a unique solution to exist the isoclines must be 
well behaved and free both from points at which infinite gradients occur and 
points of intersection giving rise to indeterminacies of gradient in the direction 
field. 

To make these ideas a little more precise let us use the following simple 
argument to suggest the form of a general existence theorem for solutions of 
the general first order differential equation 


$$y' = f(x, y), \tag{13-13}$$


in some small interval [a, b] containing the point $x = x_0$ at which we require 
$y = y_0$. 

Setting $K = f(x_0, y_0)$ and assuming K to be finite, the corresponding 
isocline is then defined by the implicit functional relationship K = f(x, y) or, 
alternatively, by F(x, y) = 0, where F(x, y) = f(x, y) - K. 


By our earlier work on implicit functions we know that a unique relationship 
$y = \phi(x)$ defining the isocline may be obtained in the neighbourhood 
of some point $(x_0, y_0)$, provided the partial derivatives $F_x$ and $F_y$ are continuous 
in the neighbourhood of $(x_0, y_0)$ and $F_y(x_0, y_0) \ne 0$. However, since 
K is constant for the particular isocline in question, $F_x = f_x$ and $F_y = f_y$, 
and so we may conclude that the continuity of $f_x$ and $f_y$ in the neighbourhood 
of $(x_0, y_0)$, together with the condition $f_y(x_0, y_0) \ne 0$, will ensure that locally 
there is a unique isocline with the associated gradient K. Consequently, there 
is no singularity of the direction field near $(x_0, y_0)$, and so an argument such 
as the Euler method will yield a solution in the neighbourhood of $(x_0, y_0)$. 
In reality the simple argument used here has resulted in conditions to be 
applied to the function f(x, y) that will certainly ensure the existence of a 
unique solution, so they are sufficient conditions; nevertheless, we shall show 
that they are too restrictive, and so are not all necessary conditions. 

Fig. 13-5 Node at origin. 

That the conditions are sufficient, but not necessary, is easily demonstrated 
by appealing to Example 13-6, in which f(x, y) = x + 1. We already 
know that the general solution is $y = \tfrac{1}{2}x^2 + x + C$ and so always exists, 
but although $f_x = 1$ and $f_y = 0$ are both continuous functions, the result 
$f_y = 0$ violates the supplementary condition that $f_y(x_0, y_0) \ne 0$. Thus this 
condition is clearly not a necessary one. 

More subtle methods of analysis give rise to the following less restrictive 
theorem which, although satisfactory for most practical purposes, is still 
only a statement of sufficient conditions. 


THEOREM 13-1 If the functions f(x, y) and $f_y(x, y)$ are continuous in a 
rectangle R of the (x, y)-plane containing the point $(x_0, y_0)$ then, for some 
sufficiently small positive number h, there exists a unique solution y = y(x) 
of the differential equation 

$$y' = f(x, y)$$

that is defined on the interval $x_0 - h < x < x_0 + h$ and is such that 
$y(x_0) = y_0$. 


In effect this theorem asserts that when the stated conditions are satisfied, 
a unique integral curve passes through each point of the rectangle R. We shall 
not pursue these arguments further, but they are obviously of importance 
when used in connection with discussions involving differential equations of 
unfamiliar type to determine whether solutions, once obtained, are unique. 

An application of the conditions of the theorem to the three previous 
examples shows that the first satisfies them everywhere in the finite plane, 
the second has infinities in f and $f_y$ along x = -1, and the third has infinities 
in f and $f_y$ along x = 0. Consequently Example 13-6 has a unique integral 
curve through every point of the finite plane, whereas in Examples 13-7 and 
13-8 the respective lines x = -1 and x = 0 must be omitted from the 
(x, y)-plane; a unique integral curve then passes through all the remaining 
points of the finite plane. 


Example 13-9 The use of isoclines in the determination of properties of 
solutions of differential equations can often be supplemented by other useful 
information obtainable directly from the equation. We illustrate this by 
considering the differential equation 

$$y' = y + \tfrac{1}{2}x + e^{-x},$$

which is seen to have isoclines determined by the equation 

$$y = K - \tfrac{1}{2}x - e^{-x}.$$

Having constructed a set of isoclines together with the associated direction 
field of tangents, we notice first that the extrema of the integral curves will 
occur along the isocline $y = -\tfrac{1}{2}x - e^{-x}$ corresponding to y' = K = 0. 
This isocline, together with several others, is shown in Fig. 13-6 (a), in which 
short inclined lines have again been used to indicate the direction field associated 
with the isoclines. 

Fig. 13-6 Integral curves and curves characterizing extrema of solutions: (a) 
approximate integral curves; (b) exact integral curves. 

Additional information may be obtained by seeking the location of 
points of inflection of the integral curves which, when they occur, must 
coincide with the vanishing of y''. This information may be obtained directly 
from the differential equation itself if we first differentiate it with respect to 
x to obtain 

$$y'' = y' + \tfrac{1}{2} - e^{-x},$$

and then substitute for y' to obtain 

$$y'' = y + \tfrac{1}{2}(1 + x).$$

Hence the points of inflection will lie along the line 

$$y = -\tfrac{1}{2}(1 + x),$$

which is shown as a chain-dotted line in Fig. 13-6 (a). 

Then, using the property of isoclines and the associated direction field, 
it is possible to sketch representative integral curves. Taking points A, B, 
and C in Fig. 13-6 (a) as typical points in the (x, y)-plane, three approximate 
integral curves have been constructed using the graphical method discussed 
earlier. Although these integral curves contain substantial errors, due to the 
small number of isoclines, they nevertheless illustrate the general behaviour 
of solutions of the differential equation 


$$y' = y + \tfrac{1}{2}x + e^{-x}.$$


As already remarked, the choice of a point in the (x, y)-plane through 
which to begin the construction of an integral curve is equivalent to specifying 
initial conditions for the differential equation. Namely, x and y are initially 
assigned the coordinates $x_0$, $y_0$ of the chosen point. It is apparent that although 
we have determined the solution for increasing x, by constructing 
tangents in the direction of decreasing x, a solution could equally well have 
been found for $x < x_0$, provided that no singular point lies on the integral 
curve in question. 

In this case no ambiguity or infinity of derivatives occurs in the finite 
plane, so that the solution of this differential equation contains no singu- 
larities. 

Using a method described in the next chapter it is easily established that 
the general solution of the differential equation just discussed is 


$$y = Ce^{x} - \tfrac{1}{2}e^{-x} - (1 + x),$$


and representative curves are shown in Fig. 13-6 (b) corresponding to the
indicated values of C. These curves illustrate, as do those of Fig. 13-6 (a),
that the nature of the extrema differs from curve to curve. Thus the lower three
integral curves in Fig. 13-6 (b) possess absolute maxima but no points of
inflection, whereas the upper three integral curves possess points of inflection
but neither maxima nor minima. However, the integral curve corresponding
to C = 0.1 possesses a local maximum at P, a point of inflection at Q, and a
local minimum at R. The line of points of inflection is again shown as a
dotted line, whilst the line of extrema (the isocline for which K = 0) is shown
as a chain-dotted line.
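
The statements above can be cross-checked numerically. The following Python sketch is illustrative only (the function names are not from the text). It takes the differential equation to be y' = y + x + e^{-x} and the general solution to be y = Ce^x − ½e^{-x} − (1 + x), confirms that the solution satisfies the equation, and locates the point of inflection for a chosen C > 0, which should fall on the line y = −(1 + x).

```python
import math

def f(x, y):
    """Right-hand side of y' = y + x + exp(-x)."""
    return y + x + math.exp(-x)

def y_general(x, c):
    """General solution y = c e^x - e^(-x)/2 - (1 + x)."""
    return c * math.exp(x) - 0.5 * math.exp(-x) - (1.0 + x)

def dy_general(x, c):
    """Derivative of the general solution, obtained by hand."""
    return c * math.exp(x) + 0.5 * math.exp(-x) - 1.0

def residual(x, c):
    """y' - f(x, y); zero (to rounding) if the solution is correct."""
    return dy_general(x, c) - f(x, y_general(x, c))

def inflection_point(c):
    """y'' = c e^x - e^(-x)/2 vanishes where e^(2x) = 1/(2c), so c > 0 is needed."""
    if c <= 0:
        raise ValueError("no point of inflection when c <= 0")
    x = -0.5 * math.log(2.0 * c)
    return x, y_general(x, c)

x_q, y_q = inflection_point(0.1)
print(x_q, y_q, -(1.0 + x_q))  # the last two values should agree
```

For C ≤ 0 the second derivative never vanishes, which matches the remark that the lower curves have no points of inflection.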


13-5 Orthogonal trajectories


The notion of an isocline helps to provide a simple solution to the problem
of the determination of trajectories orthogonal to a given family of curves.
Expressed another way, this is asking how, if a family of curves determined
by a parameter α is specified in the form

$$F(x, y, \alpha) = 0, \tag{13-14}$$

may another family of curves

$$G(x, y, \beta) = 0 \tag{13-15}$$

with parameter β be determined so that each curve of the family G intersects
all curves of the family F at right angles.

Questions of this nature are common in many branches of science and 
engineering and indeed they have already been encountered in Chapter 11 in 
connection with potential theory. Similar questions occur in magnetism and 
heat conduction. We shall now solve the general problem we have formulated 
for plane one-parameter families of curves, as systems of the type (13-14) 
and (13-15) are usually termed. 

In Section 13-3 we have already seen how, by differentiation, an arbitrary
constant α may be eliminated from a one-parameter family of curves of
the form (13-14), thereby giving rise to the differential equation that
characterizes all the curves of the family. In general this will have the form

$$y' = f(x, y), \tag{13-16}$$

which, as we have just seen, then defines the direction field of all members of
the family of integral curves represented by Eqn (13-14). Elementary
coordinate geometry tells us that the product of the gradients of orthogonal
straight lines must equal −1 and so, at every point of intersection of curves
from the orthogonal families F and G, the product of the gradients of the
tangents to these curves must also equal −1. Consequently, the differential
equation of the trajectories of family G that are orthogonal to those of F is

$$\frac{dy}{dx} = -\frac{1}{f(x, y)}. \tag{13-17}$$


Example 13-10 Let us determine the trajectories orthogonal to the family
of parabolas y² = ax, in which a is an arbitrary parameter. In the notation
of Eqn (13-14) this is equivalent to F(x, y, a) = 0, with F(x, y, a) = y² − ax
and the parameter α set equal to a. First, we must obtain the differential
equation characterizing this family of curves, by differentiation and
elimination of a. Differentiating F(x, y, a) = 0 with respect to x gives 2yy' = a,
which on elimination of a by use of the original equation gives the differential
equation y' = y/2x. The next step is to use this differential equation of the
family of parabolas to determine the differential equation of the family of
curves forming the orthogonal trajectories. As the gradient of the parabola
through the general point (x, y) is y/2x, we see by Eqn (13-17) that the
gradient of the orthogonal trajectory through the same point must be
−2x/y. Thus the differential equation of the trajectories orthogonal to the
parabolas is y' = −2x/y. This equation is of the variables-separable form
already mentioned, and the final step in the determination of the actual
family of orthogonal trajectories is the integration of this equation. We shall
postpone discussing the actual method to be used until the next chapter.
Nevertheless, it is easily verified by differentiation that the solution is the
family of ellipses x² + ½y² = C², where C² is a positive parameter.
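
The orthogonality claimed in this example can be checked point by point. The short Python sketch below is an illustration with hypothetical function names. It evaluates the gradient y/2x of the parabola y² = ax and the gradient −2x/y of the trajectory at the same point, and compares the latter with a finite-difference slope of the ellipse x² + ½y² = C² through that point.

```python
import math

def slope_parabola(x, y):
    """Gradient y' = y/(2x) of the parabola y^2 = a x through (x, y)."""
    return y / (2.0 * x)

def slope_trajectory(x, y):
    """Gradient y' = -2x/y of the orthogonal trajectory through (x, y)."""
    return -2.0 * x / y

def ellipse_slope_numeric(x, y, h=1e-6):
    """Central-difference slope of the ellipse x^2 + y^2/2 = C^2 through (x, y).

    Assumes y is not close to zero, so that the branch y = +/- sqrt(2(C^2 - x^2))
    is well defined on [x - h, x + h].
    """
    c2 = x * x + 0.5 * y * y
    sign = 1.0 if y > 0 else -1.0
    ahead = sign * math.sqrt(2.0 * (c2 - (x + h) ** 2))
    behind = sign * math.sqrt(2.0 * (c2 - (x - h) ** 2))
    return (ahead - behind) / (2.0 * h)

for x, y in [(1.0, 2.0), (0.5, -1.5), (2.0, 1.0)]:
    product = slope_parabola(x, y) * slope_trajectory(x, y)
    print((x, y), product, slope_trajectory(x, y), ellipse_slope_numeric(x, y))
```

The product of the two gradients should be −1 at every point with x and y non-zero, and the numerical ellipse slope should agree with −2x/y.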


13-6 Modified Euler method


The Euler method for the numerical solution of a first order differential
equation provides a means of determining the solution of an initial value
problem but, as we have already seen in an example and several problems,
the accuracy is poor. We now show that attention to the geometrical
implications of the method can greatly improve its accuracy. In Fig. 13-1 the
gradient appropriate to point $P_0$ was used to determine the change $dy_0$ in the
functional value over the entire interval dx = Δx. This is obviously only a first
approximation to the true situation, and a better approximation to the increment
in y consequent upon a step Δx would be provided by using the average of
the gradients at $P_0$ and $P_1$ in place of $f(x_0, y_0)$ in the Euler method. This
simple refinement applied to the previous argument is known as the modified
Euler method, in which the error at each step is of the order of $(dx)^3$.

The proposed modification is shown diagrammatically in Fig. 13-7, in
which the full straight lines passing through points $P_0$ and $P_1$ have respective
gradients $m_0 = f(x_0, y_0)$ and $m_1 = f(x_0 + dx, y_0 + dy_0)$. Then, if the dotted
line through $P_0$ has gradient $m_0' = \tfrac{1}{2}(m_0 + m_1)$, the improved approximation
$dy_0'$ to the increment in y is simply $dy_0' = m_0'\,dx$. In terms of the angles θ,
$\theta_0$, and $\theta_1$ defined in the figure, $\tan\theta = \tfrac{1}{2}(\tan\theta_0 + \tan\theta_1)$.

The improved accuracy is best illustrated by repeating the numerical
Example 13-5 to determine the value of y at x = 0.5, given that y' = xy and
y(0) = 1. To simplify the headings on the tabulation we set $m_i = f(x_i, y_i)$
and $m_{i+1} = f(x_i + dx, y_i + dy_i)$ and, as before, use increments Δx = 0.1.
 i    x_i    y_i      m_i      dy_i     m_{i+1}   m_i'     dy_i'    e^{x_i²/2}

 0    0.0    1.0      0.0      0.0      0.1       0.05     0.005    1.0
 1    0.1    1.0050   0.1005   0.0101   0.2030    0.1517   0.0152   1.0050
 2    0.2    1.0202   0.2040   0.0204   0.3122    0.2581   0.0258   1.0202
 3    0.3    1.0460   0.3138   0.0314   0.4310    0.3724   0.0372   1.0460
 4    0.4    1.0832   0.4333   0.0433   0.5633    0.4983   0.0498   1.0833
 5    0.5    1.1330                                                 1.1331

Here m_i' = ½(m_i + m_{i+1}) and dy_i' = m_i' dx.


The approximate value y(0.5) = 1.1330 shown in the third column is
now only 0.0001 low, demonstrating the superiority of the modified Euler
method over its predecessor.
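
The tabulated calculation can be reproduced with a few lines of Python. This is a minimal sketch, not the book's own procedure in any detail beyond the averaging of the two gradients. The function names `euler` and `modified_euler` are chosen here for illustration, and the arithmetic is carried in full floating-point precision rather than rounded to four decimal places at each step.

```python
import math

def euler(f, x0, y0, h, steps):
    """Plain Euler method: advance with the gradient at the start of each step."""
    y = y0
    for i in range(steps):
        y += h * f(x0 + i * h, y)
    return y

def modified_euler(f, x0, y0, h, steps):
    """Modified Euler method: advance with the mean of the gradients at
    P_i and at the Euler-predicted point P_{i+1}."""
    y = y0
    for i in range(steps):
        x = x0 + i * h
        m0 = f(x, y)                 # gradient at (x_i, y_i)
        m1 = f(x + h, y + m0 * h)    # gradient at the predicted end point
        y += 0.5 * (m0 + m1) * h     # improved increment dy_i'
    return y

def rhs(x, y):
    return x * y                     # y' = xy, with y(0) = 1

exact = math.exp(0.5 ** 2 / 2.0)     # exact solution y = exp(x^2 / 2) at x = 0.5
print(euler(rhs, 0.0, 1.0, 0.1, 5), modified_euler(rhs, 0.0, 1.0, 0.1, 5), exact)
```

With Δx = 0.1 the modified method should agree with the tabulated y(0.5) to about four decimal places, while the plain Euler value is noticeably lower.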


13-7 A simple predictor-corrector method


Despite the improvement in accuracy brought about by the modification of 
Euler’s method, it is nevertheless quite apparent that it can only be used with 
any degree of accuracy close to the initial point. Later we shall be describing 
the Runge-Kutta method, which overcomes many of these limitations, but
in the meantime it will be useful to give a brief outline of an alternative 
method, using a predictor and a corrector formula. 


Fig. 13-7 Euler's modified method.

The method we describe is perhaps the simplest of its kind but it has,
nevertheless, an accuracy of the order of $h^5$ for an integration step of length
Δx = h. This time our approach will differ in that it will be based on direct
integration of a differential equation of the form

$$\frac{dy}{dx} = f(x, y).$$


We shall suppose that by some means, possibly by the modified Euler
method, approximate values $y_0$, $y_1$, $y_2$, and $y_3$ of the solution y(x) are known
at points $x_0$, $x_1$, $x_2$, and $x_3$, equally spaced with step interval h. Now bearing
in mind the definition of a solution y(x) as expressed by Eqn (13-2), we may
rewrite our differential equation in the form

$$\frac{dy}{dx} = f(x, y(x)). \tag{13-18}$$


If we integrate this result over the interval $[x_0, x_4]$ we obtain

$$\int_{x_0}^{x_4} \left(\frac{dy}{dx}\right) dx = \int_{x_0}^{x_4} f(x, y(x))\,dx,$$

or

$$y(x_4) = y(x_0) + \int_{x_0}^{x_4} f(x, y(x))\,dx, \tag{13-19}$$

where, in general, $x_m = x_0 + mh$, with m an integer. Thus, if the integral in
Eqn (13-19) can be estimated using the information available to us at the
starting points $x_0$ to $x_3$, then the equation can be used to predict $y(x_4)$. Since
an error will be involved in approximate methods of integration we shall
write $y_p(x_4)$ for the predicted value of $y(x_4)$. Now in Problem 12-46 (D) we
already have a formula for evaluating the integral in Eqn (13-19) that only
uses the values of y' at points $x_1$, $x_2$, and $x_3$. Expressing the result in terms of
the points $x_0$ to $x_4$ we thus have


$$y_p(x_4) = y(x_0) + \frac{4h}{3}\,[2y'(x_3) - y'(x_2) + 2y'(x_1)], \tag{13-20}$$

where $y'(x_m)$ signifies the value of dy/dx at $x = x_m$. However, from the four
given starting values we may use Eqn (13-18) to calculate approximate values
of $y'(x_m)$, so that Eqn (13-20) becomes

$$y_p(x_4) = y(x_0) + \frac{4h}{3}\,[2f(x_3, y_3) - f(x_2, y_2) + 2f(x_1, y_1)]. \tag{13-21}$$

Expressed in terms of any five consecutive points $x_{n-3}$ to $x_{n+1}$ this result,
which is called a predictor formula, has the general form

$$y_p(x_{n+1}) = y(x_{n-3}) + \frac{4h}{3}\,[2f(x_n, y_n) - f(x_{n-1}, y_{n-1}) + 2f(x_{n-2}, y_{n-2})]. \tag{13-22}$$
Returning to the determination of $y_p(x_4)$ by Eqn (13-21), we may now
use this value together with Eqn (13-18) to find, as an approximation to
$y'(x_4)$, the value $f[x_4, y_p(x_4)]$. Using this estimate of $y'(x_4)$ we now correct
the estimate $y_p(x_4)$ by use of Simpson's rule. We first write down the obvious
result

$$y(x_4) - y(x_2) = \int_{x_2}^{x_4} \left(\frac{dy}{dx}\right) dx \tag{13-23}$$

and then use Simpson's rule to evaluate the right-hand side in terms of the
known values of y'(x) at the three points $x_2$, $x_3$, and $x_4$. This result will then
express the value of $y(x_4)$ in terms of known quantities and we shall take this
value as the corrected value for $y(x_4)$. Since an error will again be involved in
the numerical integration we shall write $y_c(x_4)$ to denote the corrected estimate
of $y(x_4)$, when Eqn (13-23) becomes

$$y_c(x_4) = y(x_2) + \frac{h}{3}\,\{f[x_4, y_p(x_4)] + 4f(x_3, y_3) + f(x_2, y_2)\}. \tag{13-24}$$


The improvement in the accuracy of $y_c(x_4)$ so determined arises from the
fact that the error term in the predictor formula has been shown to have
magnitude $|28h^5 y^{(5)}(\xi)/90|$, whereas the magnitude of the error in the
corrector formula has been shown to be only $|h^5 y^{(5)}(\xi)/90|$. Here
$y^{(5)}(\xi) = (d^5y/dx^5)_{x=\xi}$, with ξ an interior point of the interval of
integration.


Using this value $y_c(x_4)$ we again use Eqn (13-18) to recalculate $y'(x_4)$,
obtaining the corrected value $f[x_4, y_c(x_4)]$. This completes the calculation
since we now know $y_c(x_4)$ and $f[x_4, y_c(x_4)]$, which we take as the true values
of $y(x_4)$ and $y'(x_4)$, respectively.

Again, expressed in terms of any three consecutive points $x_{n-1}$, $x_n$, and
$x_{n+1}$, result (13-24), which is called a corrector formula, has the general form

$$y_c(x_{n+1}) = y(x_{n-1}) + \frac{h}{3}\,\{f[x_{n+1}, y_p(x_{n+1})] + 4f(x_n, y_n) + f(x_{n-1}, y_{n-1})\}. \tag{13-25}$$


Writing $y_4 = y(x_4)$, we then use the known values $y_1$, $y_2$, $y_3$, and $y_4$ at
points $x_1$, $x_2$, $x_3$, and $x_4$, and repeat the process to determine $y_5 = y(x_5)$.
Thereafter, repetition of the method will advance the solution in increments
h in x as far as is desired. This manner of solution is known as Milne's
method.

Although the modified Euler method can be used to obtain starting values 
for the predictor-corrector approach we shall see later that other methods 
are available that provide very high accuracy. One such is the series method of 
solution outlined in Chapter 15. 
Example 13-11 Given that dy/dx = xy and y(0) = 1, y(0.2) = 1.02020,
y(0.4) = 1.08329, y(0.6) = 1.19722, let us use the predictor-corrector method
to compute y(0.8) and y(1.0).


 n    x      y         y'

 0    0.0    1.00000   0
 1    0.2    1.02020   0.20404
 2    0.4    1.08329   0.43332
 3    0.6    1.19722   0.71833

Here h = 0.2 so that from Eqn (13-21) we have

$$y_p(x_4) = 1.00000 + \frac{0.8}{3}\,(2 \times 0.71833 - 0.43332 + 2 \times 0.20404),$$

giving $y_p(x_4) = 1.37638$. Our first predicted value $(y'_4)_p$ of $y'(x_4)$ is thus
$(y'_4)_p = 0.8 \times 1.37638 = 1.10110$.

Using this value to calculate $y_c(x_4)$ from Eqn (13-24) we have

$$y_c(x_4) = 1.08329 + \frac{0.2}{3}\,[1.10110 + 4 \times 0.71833 + 0.43332],$$

giving $y_c(x_4) = 1.37714$. The corrected value of $y'(x_4)$ is then
$(y'_4)_c = 0.8 \times 1.37714 = 1.10171$.

This completes the determination of $y(x_4)$ and $y'(x_4)$, since we set
$y(x_4) = y_c(x_4) = 1.37714$ and $y'(x_4) = (y'_4)_c = 1.10171$.

To determine $y(x_5) = y(1.0)$ we use as starting values the entries in the
following table:


 n    x      y         y'

 1    0.2    1.02020   0.20404
 2    0.4    1.08329   0.43332
 3    0.6    1.19722   0.71833
 4    0.8    1.37714   1.10171


Then, as before, but now using Eqn (13-22), with n = 4, we find

$$y_p(x_5) = 1.02020 + \frac{0.8}{3}\,[2 \times 1.10171 - 0.71833 + 2 \times 0.43332],$$

giving $y_p(x_5) = 1.64733$. This then gives $(y'_5)_p = 1.0 \times 1.64733 = 1.64733$.
Computing $y_c(x_5)$ from Eqn (13-25), with n = 4, we obtain

$$y_c(x_5) = 1.19722 + \frac{0.2}{3}\,[1.64733 + 4 \times 1.10171 + 0.71833],$$

or $y_c(x_5) = 1.64872$. The value $(y'_5)_c$ for a further integration step, should it
be desired, is $(y'_5)_c = 1 \times 1.64872 = 1.64872$.

The correction to $y_p(x_4)$ was +0.00076 and the correction to $y_p(x_5)$ was
+0.00139.

Comparison of $y_c(x_4)$ and $y_c(x_5)$ with the actual values obtained from
$y = e^{x^2/2}$, with $x_4 = 0.8$ and $x_5 = 1.0$, shows that to five places of decimals,
the error in $y_c(x_4)$ was −0.00001, whereas $y_c(x_5)$ was exact.
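
The two integration steps of this example can be reproduced by coding the predictor (13-22) and the corrector (13-25) directly. The following Python sketch is illustrative. The function name `milne_step` and the list-based bookkeeping are assumptions made here, and no intermediate rounding to five decimals is applied, so the last digit may differ slightly from the hand calculation.

```python
import math

def milne_step(f, h, xs, ys):
    """One predictor-corrector step.

    xs, ys hold at least the last four points (x_{n-3}, ..., x_n) and the
    corresponding solution values. Returns (x_{n+1}, y_p, y_c).
    """
    x_next = xs[-1] + h
    d_nm2 = f(xs[-3], ys[-3])      # y'(x_{n-2})
    d_nm1 = f(xs[-2], ys[-2])      # y'(x_{n-1})
    d_n = f(xs[-1], ys[-1])        # y'(x_n)
    # Predictor, Eqn (13-22)
    y_pred = ys[-4] + (4.0 * h / 3.0) * (2.0 * d_n - d_nm1 + 2.0 * d_nm2)
    # Corrector (Simpson's rule), Eqn (13-25), using the predicted value
    y_corr = ys[-2] + (h / 3.0) * (f(x_next, y_pred) + 4.0 * d_n + d_nm1)
    return x_next, y_pred, y_corr

def rhs(x, y):
    return x * y                   # dy/dx = xy

xs = [0.0, 0.2, 0.4, 0.6]
ys = [1.0, 1.02020, 1.08329, 1.19722]   # starting values given in the example
results = []
for _ in range(2):                 # advance to x = 0.8 and then x = 1.0
    x_new, y_p, y_c = milne_step(rhs, 0.2, xs, ys)
    xs.append(x_new)
    ys.append(y_c)                 # the corrected value is taken as y(x_{n+1})
    results.append((x_new, y_p, y_c, math.exp(x_new ** 2 / 2.0)))

for row in results:
    print(row)                     # x, predicted, corrected, exact
```

Only the last four tabulated points are used at each step, so the same function can be called repeatedly to carry the solution as far as desired.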

If at the completion of an integration it is desired to change the integration
step length from h to h', then this may be accomplished by means of
interpolation. Using the available tabular entries of y, an interpolation formula
must be used to deduce appropriate functional values at four new values of x
equally spaced with the new interval h'. Thereafter the method proceeds as
before, using these new starting values and the step length h'.


PROBLEMS 
Section 13-1 


13-1 Determine the order and degree of each of the following equations: 
(a) x2y” + y? + y = 0; 
(b) y’2 + 2xy = 0; 
is 9 
(c) dina! α 
(xy + 1) 
(d) (H+2)"- ἕν" + xy’ + y); 
ora Ae Ey; 
(e) γ΄ + 2y"2 Ὁ 6xy = ef, 


Section 13-2 


13-2 Determine the differential equation of the curve that has the property that 
the length of the interval of the x-axis contained between the intercepts of 
the tangent and ordinate to a general point on the curve has a constant 
value k. 


13-3 Obtain the differential equation governing the motion of a particle of mass m 
that is projected vertically upwards in a medium in which the resistance is 1 
times the square of the particle velocity v. 


13-4 Derive the differential equation which describes the rate of cooling of a body
at a temperature T on the assumption that the rate of cooling is k times the
excess of the body temperature above the ambient temperature $T_0$ of the
surrounding air. This is known as Newton's Law of Cooling and it is a good
approximation for small temperature differences.
Section 13-3


13-5 Eliminate the arbitrary constants in the following expressions to determine 
the differential equations for which they are the general solutions: 


(a) 3x7 + ΞΟ; (b) y = Cx + C3; 
(c) x8 = C(x + y)?; (d) log [13 = Cy; 
(e) y = Ae* + Ber; (Ὁ y= (C+ Dx)e**. 


13-6 Determine whether the following expressions satisfy the associated differ- 
ential equations for all real x: 


(a) y=x?— 2x; xp + y= x; 
1 ᾿ Z 
ae ΡΞ ls 
(c) y= sin3x + cos3x; γ΄ + 9y =0; 
(d) y = e*(A cos 2x + Bsin 2x); γ΄ + 2γ' + 5y = 0; 
(e) y= 2x(e7 + C); xy -- y= x%e%; 
(Ὁ y= Acosx + Bsinx — 3xcosx; γ΄ +y=sinx. 


Section 13-4 


13:7 Determine whether the following differential equations have the associated 
functions as their solution over the stated intervals: 
; ἔχ + 1,x <1 i 
(a) y == ete (-l<x< 1); 
(Ὁ) γ΄ — 9y = 0, y(x) = A cosh 3x + Bsinh3x(—x « χ « &); 


: 4x+2,x< 0 
(c) y = 4,900 = | 


—2<x< —4}); 
a2 χα Ξ 0 ἰὸς ὰ 


1 
(4) γ΄ + y? = 0, px) = ΤΕΥ (—2<x <0); 


(e) x4y” + y= 0, p(x) = x sin = O<x< 1); 
(Ὁ γ΄ — 3x°y = 0, px) Ξ οὐ (-οὐ < x < 10). 


13-8 Taking intervals Δx = 0.2, use Euler's method to determine y(1), given that
y' + y = 0 and y(0) = 1. Compare your results with the exact solution
$y = e^{-x}$. Construct the Cauchy polygon.


13-9 Taking intervals Δx = 0.1, use Euler's method to determine y(1), given that
dy/dx = (x² + y)/x and y(0.5) = 0.5. Compare your results with the exact
solution y = ½x + x². Construct the Cauchy polygon.


13-10 Taking intervals Δx = 0.2, use Euler's method to determine y(1), given that
$y' = y + e^{-x}$ and y(0) = 0. Compare your results with the exact solution
y = sinh x. Construct the Cauchy polygon.


13-11 Repeat Problem 13-10, taking Ax = 0-1 and determine the improvement in 
accuracy. 


13-12 Draw the isoclines and sketch the direction fields for the differential equations
in Problems 13-8 to 13-10.


13-13 Use the isoclines and direction field for the differential equation dy/dx =
(x² + y)/x to deduce the behaviour of the integral curves close to the origin.
What form of singularity occurs at the origin?


13-14 Using isoclines and the associated direction field, determine the approximate
value of y(1), given that $y' = y + e^{-x}$ and y(0) = −1. Compare your result
with the exact solution y = −cosh x.


13-15 Use Theorem 13-1 to determine whether the following differential equations
have a unique solution passing through all points $(x_0, y_0)$ of the given regions:


dy _ (ty 
@ 2 = [53 ( l<x<3;-10<y< 10); 


ἂν (3y+1 : 
ὦ 5: = ( 55. ) (ἀ Ξ α ΞΊ1;2 ay 9 5); 
ἂν (.» - wid aah 
(c) z= (4) (-3 <x <3; y< —4}y); 


(4) y =y+e (all finite points of the (x, y)-plane). 


Section 13-5 
13-16 Sketch the orthogonal trajectories in Example 13-10.


13-17 Derive the differential equation describing the trajectories that are orthogonal
to the family of curves y = ax. Show, by differentiation, that the family of
circles x² + y² = C², with C² a positive parameter, is the general solution
describing the orthogonal trajectories.


13-18 Derive the differential equations of the families of trajectories that are
orthogonal to the integral curves of the following differential equations:


(a) x?y’ — y = y*; 
(b) γ' sinx+y+1=0; 
(c) e*y’ + ετῷά + yy =0. 


Section 13-6 


13-19 Repeat Problem 13-8, using the modified Euler method and compare with
the previous numerical solution.


13-20 Repeat Problem 13-9, using the modified Euler method and compare with
the previous numerical solution.


Section 13-7 


13-21 Use the predictor-corrector method to integrate y' = xy as far as x = 0.3,
taking increments h = 0.05, obtaining the starting values by means of the
modified Euler method. Compare your results with the exact solution
$y = e^{x^2/2}$.


13-22 Use the predictor-corrector method to integrate y' = y + sin x as far as
x = 0.3, taking increments h = 0.05, obtaining the starting values by means
of the modified Euler method. Compare your results with the exact solution
$y = \tfrac{1}{2}(3e^{x} - \sin x - \cos x)$.


First order differential 
equations 


14-1 Equations with separable variables


The class of differential equations in which the variables are separable was
identified in Section 13-2, where it was remarked that either of the two forms

$$\frac{dy}{dx} = M(x)N(y) \tag{14-1}$$

or

$$P(x)Q(y)\,dx + R(x)S(y)\,dy = 0 \tag{14-2}$$

may arise. Here, Eqn (14-2) must, of course, be interpreted in the sense of
differentials already defined elsewhere.

As the name implies, such equations are solved by rewriting so that 
functions of the variable x, together with its differential dx, and functions of 
the variable y, together with its differential dy, occur on opposite sides of the 
equation. The general solution may then be obtained by direct integration 
and the introduction of an arbitrary constant. 

Written in symbolic form the solutions of Eqns (14-1) and (14-2) may be
expressed as

$$\int \frac{dy}{N(y)} = \int M(x)\,dx + C, \tag{14-3}$$

provided N(y) is non-vanishing, and

$$\int \frac{S(y)}{Q(y)}\,dy = -\int \frac{P(x)}{R(x)}\,dx + C, \tag{14-4}$$

provided Q(y) and R(x) are non-vanishing where, of course, C is an arbitrary
constant.


Example 14-1 Suppose, as in Example 13-7, that

$$\frac{dy}{dx} = \frac{1 - y}{1 + x}.$$

Solution Divide the equation by (1 − y) and multiply by the differential
dx to obtain

$$\frac{1}{1 - y}\left(\frac{dy}{dx}\right) dx = \frac{dx}{1 + x}.$$

From our definition of a differential we recognize that (dy/dx) dx = dy, so
that in differential form the equation becomes

$$\frac{dy}{1 - y} = \frac{dx}{1 + x}.$$

Henceforth, to shorten discussions, we shall proceed directly from differential
equations to results of this form, omitting the formal introduction of the
differentials. That is, when convenient, we shall regard dy/dx either as the
single entity denoting a derivative or as the ratio of the two differentials dy
and dx.

On integrating this last result we obtain

$$\int \frac{dy}{1 - y} = \int \frac{dx}{1 + x},$$

or

$$-\log|1 - y| = \log|1 + x| - \log C,$$

where, for convenience, we write the arbitrary constant in the form log C.
The general solution is thus |(1 − y)(1 + x)| = C where, by virtue of the
form of this expression, C is, of course, an essentially positive constant.
Alternatively, the modulus signs may be removed and the general solution
written as

$$y = 1 - \frac{C}{1 + x}.$$


In arriving at this solution we divided by the factor (1 — y), which it was 
assumed was non-zero. To complete the solution we must now recognize 
that if this factor vanishes then the method of solution just outlined will fail. 
Clearly, when this happens we must also enquire whether the vanishing of the 
factor itself will give rise to a solution. Now the factor (1 — y) vanishes when 
y = 1, and it is simple to substitute y = 1 into the original differential 
equation to verify that it is in fact a degenerate solution. 


$$y = 1.$$


Example 14-2 Next let us solve the following equation, already expressed
in differential form:

$$x \cos y\,dx - e^{-x} \sec y\,dy = 0.$$

Solution This is of the form shown in Eqn (14-2) and, as implied by Eqn
(14-4), after division by $e^{-x} \cos y$ the solution may be written

$$\int \sec^2 y\,dy = \int x e^{x}\,dx + C.$$

Performing the indicated integrations then gives

$$\tan y = e^{x}(x - 1) + C \quad\text{or}\quad y = \arctan[e^{x}(x - 1) + C].$$

Again we must check to see if the vanishing of the divisor gives rise to
another solution. Here the divisor was $e^{-x} \cos y$, which only has zeros when
$y = (2n + 1)\tfrac{1}{2}\pi$ with n an integer. Substitution of these values of y into the
original differential equation shows that they are not solutions.
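
A solution obtained by separation of variables is easily checked by substitution. The Python sketch below is illustrative only, with hypothetical function names. It takes the equation as x cos y dx − e^{-x} sec y dy = 0, so that dy/dx = x e^x cos² y, and compares this gradient with a central-difference derivative of y = arctan[e^x(x − 1) + C].

```python
import math

def y_solution(x, c):
    """General solution y = arctan(e^x (x - 1) + c)."""
    return math.atan(math.exp(x) * (x - 1.0) + c)

def slope_from_equation(x, y):
    """dy/dx implied by x cos y dx - exp(-x) sec y dy = 0."""
    return x * math.exp(x) * math.cos(y) ** 2

def slope_numeric(x, c, h=1e-6):
    """Central-difference estimate of dy/dx along the solution curve."""
    return (y_solution(x + h, c) - y_solution(x - h, c)) / (2.0 * h)

for c in (-0.5, 0.0, 2.0):
    for x in (-1.0, 0.3, 1.2):
        y = y_solution(x, c)
        print(c, x, slope_from_equation(x, y), slope_numeric(x, c))
```

The two gradients should agree to within the accuracy of the finite-difference estimate for every choice of C.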


14-2 Homogeneous equations


A function F(x, y) is said to be homogeneous of degree n in the algebraic
sense if $F(kx, ky) = k^n F(x, y)$ for every real number k. Thus the function
$F(x, y) = ax^2 + bxy + cy^2$ is homogeneous of degree 2, since $F(kx, ky) =
k^2 F(x, y)$. If a change of variable y = sx is made then any homogeneous
function F(x, y) of degree n may be written in the form $F(x, y) = F(x \times 1,
x \times s) = x^n F(1, s)$, since here x is now cast in the role of k. This variable
change has resulted in F(x, y) being expressed as the product of $x^n$, which is
only a function of x, and F(1, s), which is only a function of s.
This facilitates the integration of a differential equation of the form

$$M(x, y)\,dx + N(x, y)\,dy = 0, \tag{14-5}$$

in which the functions M(x, y) and N(x, y) are both homogeneous functions
of the same degree. Such equations are called homogeneous equations.
'Homogeneous' here refers to the algebraic property shared by M and N
and is not to be confused with the same term introduced in Section 13-2.

The change of variable implies that dy = s dx + x ds, and if M and N
are of degree n, then $M(x, y) = x^n M(1, s)$ and $N(x, y) = x^n N(1, s)$.

If, for simplification, we write P(s) = M(1, s) and Q(s) = N(1, s),
Eqn (14-5) becomes

$$x^n P(s)\,dx + x^n Q(s)(s\,dx + x\,ds) = 0 \tag{14-6}$$

or, on cancelling $x^n$ and rearranging,

$$[P(s) + sQ(s)]\,dx + xQ(s)\,ds = 0.$$

This equation is of the variables-separable form and has the general solution

$$\int \frac{Q(s)\,ds}{P(s) + sQ(s)} = \log\left|\frac{C}{x}\right|, \tag{14-7}$$

where C is an arbitrary constant.

The final solution in terms of x and y may be obtained by using the
relation s = y/x. The vanishing of the divisor $x^n$ also implies a possible
solution x = 0 which must be tested for validity against the original
differential equation.


Example 14-3 Let us integrate the following homogeneous differential
equation both directly and by an application of Eqn (14-7):

$$(y - x)\,dx - x\,dy = 0.$$

Solution

Direct method: The functions multiplying dx and dy are both of degree 1,
so the equation is homogeneous. Setting y = sx produces the equation

$$x(s - 1)\,dx - x(s\,dx + x\,ds) = 0.$$

Cancelling the factor x and rearranging we see that

$$\int ds = -\int \frac{dx}{x},$$

whence

$$s = C - \log|x| \quad\text{or}\quad y = x(C - \log|x|).$$

The original equation shows that the vanishing of the cancelled factor x
gives rise to an additional solution x = 0.

Solution by Equation (14-7): Here M(x, y) = y − x and N(x, y) = −x so
that M(1, s) = s − 1 and N(1, s) = −1. Hence P(s) = s − 1 and Q(s) = −1.
Substituting in Eqn (14-7) gives

$$\int ds = \log\left|\frac{C}{x}\right| \quad\text{or}\quad s = \log\left|\frac{C}{x}\right|.$$

Again using s = y/x we find that

$$y = x(\log|C| - \log|x|),$$

which agrees with the previous answer apart from the form of the arbitrary
constant.
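
As a check on this example, the solution y = x(C − log|x|) may be substituted back into (y − x)dx − x dy = 0. The Python sketch below is illustrative, and its function names are not from the text. It evaluates the residual (y − x) − x y' using the derivative obtained by hand, and also confirms that the second form of the answer is the same family with the constant written as log|C|.

```python
import math

def y_direct(x, c):
    """Solution from the direct method: y = x (c - log|x|)."""
    return x * (c - math.log(abs(x)))

def dy_direct(x, c):
    """Derivative of y_direct with respect to x, for x != 0."""
    return c - math.log(abs(x)) - 1.0

def residual(x, c):
    """(y - x) - x y'; zero if y satisfies (y - x) dx - x dy = 0."""
    return (y_direct(x, c) - x) - x * dy_direct(x, c)

def y_by_formula(x, big_c):
    """Solution from Eqn (14-7): y = x (log|C| - log|x|)."""
    return x * (math.log(abs(big_c)) - math.log(abs(x)))

for x in (-2.0, 0.5, 3.0):
    print(x, residual(x, 1.5), y_direct(x, 1.5), y_by_formula(x, math.exp(1.5)))
```

The two forms coincide when the constant of the direct method equals the logarithm of the modulus of the constant in Eqn (14-7).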


A slight variant of the homogeneous equation is the so-called
near-homogeneous equation, which has the general form

$$\frac{dy}{dx} = \frac{ax + by + p}{cx + dy + q}, \qquad ad - bc \neq 0,$$

and which is not homogeneous as it stands. However, by setting x = X + α,
y = Y + β, where α, β are constants, it follows that dy/dx = dY/dX, and
hence that

$$\frac{dY}{dX} = \frac{aX + bY + a\alpha + b\beta + p}{cX + dY + c\alpha + d\beta + q}.$$

Now if the constants α, β are such that aα + bβ + p = 0 and cα + dβ + q
= 0, then the transformed equation becomes homogeneous in X, Y. It may
then be solved as indicated above and the solution transformed back in terms
of x, y by setting X = x − α, Y = y − β.

For example, if

$$\frac{dy}{dx} = \frac{2x + 3y + 3}{x + y + 2}$$

then α, β would be determined by solving 2α + 3β + 3 = 0 and α + β + 2
= 0, whence α = −3, β = 1, so that X = x + 3, Y = y − 1. The details
of the solution of the resulting homogeneous equation

$$\frac{dY}{dX} = \frac{2X + 3Y}{X + Y}$$

are left to the reader.
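
The shift constants α and β are the solution of a pair of linear equations, so they may be found by Cramer's rule whenever ad − bc ≠ 0. The Python sketch below is illustrative (the function name `shift_constants` is an assumption made here). It solves aα + bβ + p = 0, cα + dβ + q = 0 and checks that the constant terms vanish for the numerical example dy/dx = (2x + 3y + 3)/(x + y + 2).

```python
def shift_constants(a, b, p, c, d, q):
    """Solve a*alpha + b*beta + p = 0 and c*alpha + d*beta + q = 0 by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the shift x = X + alpha, y = Y + beta fails")
    alpha = (b * q - p * d) / det
    beta = (p * c - a * q) / det
    return alpha, beta

alpha, beta = shift_constants(2, 3, 3, 1, 1, 2)
print(alpha, beta)

# Constant terms left in the numerator and denominator after the shift
print(2 * alpha + 3 * beta + 3, alpha + beta + 2)
```

When the determinant vanishes the function raises an error, corresponding to the case in which a different substitution is needed.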

If ad − bc = 0, then the equation may be solved by setting u = ax + by
+ p and separating the variables.


14-3 Exact equations


DEFINITION 14-1 (exact differential equation) A differential equation of
the form

$$M(x, y)\,dx + N(x, y)\,dy = 0 \tag{14-8}$$

with the property that M(x, y) and N(x, y) are related to a differentiable
function F(x, y) by the equations

$$\frac{\partial F}{\partial x} = M \quad\text{and}\quad \frac{\partial F}{\partial y} = N, \tag{14-9}$$

is said to be exact.


There are various ways of solving this simple but important type of
differential equation. Here we choose to display the relationship of its
solution to the familiar ideas of a total derivative and to the parametric
representation of a variable y which is a function of x. Since the original
differential equation (14-8) implies a functional relationship between x and y,
we may suppose that the parametric representation x = x(t), y = y(t) describes
the solution, where t is some parameter. Then the differentials dx, dy, and dt are
related to the derivatives dx/dt and dy/dt by the expressions dx = (dx/dt)dt,
dy = (dy/dt)dt. Using these results in Eqn (14-8), substituting for M(x, y)
and N(x, y) from Eqn (14-9), and cancelling the differential dt, we find that

$$\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} = 0. \tag{14-10}$$

However, since the total derivative of F[x(t), y(t)] with respect to t is

$$\frac{dF}{dt} = \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt}, \tag{14-11}$$
Eqns (14-10) and (14-11) together imply that dF/dt = 0 or, equivalently, that
F(x, y) = constant, where now the parameter t may be disregarded. Thus we
have established that the general solution of Eqns (14-8), (14-9) is

$$F(x, y) = C. \tag{14-12}$$

Now we already know that if F(x, y) is a differentiable function with
continuous first and second order derivatives then there must everywhere be
equality between the mixed derivatives ∂²F/∂x∂y and ∂²F/∂y∂x. Applying this
argument to Eqns (14-9) shows that the condition for a differential equation
of type (14-8) to be exact must be that everywhere ∂M/∂y = ∂N/∂x.

We have thus proved the following theorem, by means of which we may 
test a differential equation to discover if it is exact. 


THEOREM 14-1 If the functions M(x, y) and N(x, y) and their derivatives
∂M/∂y and ∂N/∂x are continuous and, furthermore, ∂M/∂y = ∂N/∂x
everywhere, then the differential equation


M(x, y)dx + N(x, y)dy = 0 


is exact. 


Equations (14-9) also provide the means by which the function F(x, y) 
may be obtained, since integrating the first equation with respect to x gives 


F(x, y) = J M(x, y)dx + A(y), (14-13) 


where A(y) is some function of y and acts as though it were a constant as
regards partial differentiation with respect to x. Thus A(y) takes the place of
the constant of integration that would occur were M to be only a function
of x. The determination of A(y) then follows immediately by differentiating
Eqn (14-13) partially with respect to y, and using the second of Eqns (14-9) to
determine dA/dy, from which A(y) follows by integration in the form

$$A(y) = \int \left[ N(x, y) - \frac{\partial}{\partial y} \int M(x, y)\,dx \right] dy + C'. \tag{14-14}$$


Clearly, C’ can only be an ordinary constant of integration and not a 
function of x since, by its manner of construction, the integrand of Eqn 
(14-14) is only a function of y so that to introduce a further function of x in 
place of C’ would be inconsistent with the equation. This arbitrary constant 
C’ can be combined with the constant C appearing in Eqn (14-12) so that there 
is in fact only one constant of integration appearing in the general solution, 
as would be expected with a first order equation. 


Example 14-4 Let us solve the differential equation

$$[2x + 3\cos y]\,dx + [2y - 3x \sin y]\,dy = 0.$$

The equation is exact since, as M(x, y) = 2x + 3 cos y and N(x, y) = 2y
− 3x sin y, it follows that ∂M/∂y = ∂N/∂x = −3 sin y, satisfying the
conditions of Theorem 14-1. Now as ∂F/∂x = 2x + 3 cos y,

$$F(x, y) = \int (2x + 3\cos y)\,dx + A(y)$$

or

$$F(x, y) = x^2 + 3x \cos y + A(y).$$

Differentiating partially with respect to y we find that

$$\frac{\partial F}{\partial y} = -3x \sin y + \frac{dA}{dy},$$

which, since ∂F/∂y = N(x, y), is equivalent to dA/dy = 2y and hence
A(y) = y² + C'. The function F(x, y) is thus

$$F(x, y) = x^2 + 3x \cos y + y^2 + C'$$

and so the general solution is

$$x^2 + 3x \cos y + y^2 = C.$$

If a particular solution is required then the value of the constant C must
be determined by requiring the integral curve to pass through a specified
point. For example, suppose that our equation is required to have the initial
value x = 0, y = ½π; then clearly C = ¼π² and the integral curve representing
the solution is

$$x^2 + 3x \cos y + y^2 = \tfrac{1}{4}\pi^2.$$
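
The test of Theorem 14-1 and the construction of F can both be checked numerically. The following Python sketch is illustrative only. The helper `partial` is a simple central-difference approximation introduced here, not a method from the text. It confirms that ∂M/∂y = ∂N/∂x for this example, that F(x, y) = x² + 3x cos y + y² has M and N as its partial derivatives, and that the initial point (0, ½π) gives C = ¼π².

```python
import math

def M(x, y):
    return 2.0 * x + 3.0 * math.cos(y)

def N(x, y):
    return 2.0 * y - 3.0 * x * math.sin(y)

def F(x, y):
    """Candidate function whose level curves F = C are the integral curves."""
    return x * x + 3.0 * x * math.cos(y) + y * y

def partial(g, x, y, wrt, h=1e-6):
    """Central-difference partial derivative of g with respect to 'x' or 'y'."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    return (g(x, y + h) - g(x, y - h)) / (2.0 * h)

def exactness_gap(x, y):
    """dM/dy - dN/dx, which should vanish for an exact equation."""
    return partial(M, x, y, "y") - partial(N, x, y, "x")

for x, y in [(0.0, 1.0), (1.5, -0.7), (-2.0, 2.5)]:
    print(exactness_gap(x, y),
          partial(F, x, y, "x") - M(x, y),
          partial(F, x, y, "y") - N(x, y))

print(F(0.0, 0.5 * math.pi), 0.25 * math.pi ** 2)
```

All three differences printed in the loop should be zero to within the accuracy of the finite-difference approximation.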


Although not all differential equations of the form (14-8) are exact, it
sometimes happens that they can be made exact by multiplying by some simple
function μ(x, y), called an integrating factor, which is not identically zero.
So, if μ(x, y) is an integrating factor for Eqn (14-8), then

$$\mu(x, y) M(x, y)\,dx + \mu(x, y) N(x, y)\,dy = 0 \tag{14-15}$$

is exact, and by virtue of Theorem 14-1 the integrating factor must satisfy
the equation

$$\frac{\partial}{\partial y}(\mu M) = \frac{\partial}{\partial x}(\mu N). \tag{14-16}$$

It is possible to establish that an integrating factor always exists, provided
only that M and N are differentiable functions, but unfortunately there is no
general method by which it may be found.

Two special forms of differential equation in which an integrating factor
may always be found have already been encountered in Eqns (14-2) and (14-5).
The integrating factor for Eqn (14-2), in which the variables are separable,
is μ(x, y) = 1/[R(x)Q(y)] and leads to the result (14-4). Similarly, the
integrating factor for Eqn (14-5), with homogeneous coefficients, was
μ(x, s) = 1/[P(s) + sQ(s)], where s = y/x, and gave rise to the solution
shown in Eqn (14-7).

When the integrating factor is believed to be of a simple form in which 
only certain constants need to be determined, direct substitution in Eqn 
(14-16) would confirm this, and also indicate conditions by which these con- 
stants may be determined. The arguments used in this trial and error method 
will now be applied to a search for an integrating factor having the form 
μίχ, y) = x™y" in the following simple example. 


Example 14-5 Solve the differential equation

$$(2xy + y^2)\,dx + (2x^2 + 3xy)\,dy = 0.$$

Solution First we notice that $\partial M/\partial y \ne \partial N/\partial x$, so the equation is not exact
and an integrating factor $\mu$ is required. As M and N are simple algebraic
functions we shall try an expression of the form $\mu(x, y) = x^m y^n$, in which
the constants m and n must be determined so that

$$x^m y^n (2xy + y^2)\,dx + x^m y^n (2x^2 + 3xy)\,dy = 0$$

is exact. By condition (14-16) this implies that if $\mu(x, y) = x^m y^n$ is in actual
fact an integrating factor, then m and n must be chosen so that

$$\frac{\partial}{\partial y}\left[x^m y^n (2xy + y^2)\right] = \frac{\partial}{\partial x}\left[x^m y^n (2x^2 + 3xy)\right].$$

This condition gives rise to the equation

$$n x^m y^{n-1} (2xy + y^2) + x^m y^n (2x + 2y) = m x^{m-1} y^n (2x^2 + 3xy) + x^m y^n (4x + 3y),$$

from which we must determine m and n if the chosen form of integrating
factor is correct. Since this expression must be an identity, we now equate
coefficients of terms of equal degrees in x and y and, if possible, select m
and n such that all conditions are satisfied. In this case only two conditions
arise:

(a) terms involving $x^m y^{n+1}$:
$n + 2 = 3m + 3$;

(b) terms involving $x^{m+1} y^n$:
$2n + 2 = 2m + 4$.

These conditions are satisfied if m = 0, n = 1, so an integrating factor of
the type assumed does exist, and in this case $\mu = y$. The exact differential
equation is thus

$$(2xy^2 + y^3)\,dx + (2x^2 y + 3xy^2)\,dy = 0,$$

which is easily seen to have the general solution

$$x^2 y^2 + xy^3 = C.$$

When values of m and n cannot be found that will produce an identity
from condition (14-16), then the integrating factor is not of the form
$\mu(x, y) = x^m y^n$.
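The exactness test of Theorem 14-1 and the result of Example 14-5 can be spot-checked numerically. The following is only a rough sketch: the helper names `partial` and `exactness_gap`, the step size and the sample points are illustrative choices, not part of the text, and central differences are used in place of exact differentiation.

```python
def partial(f, x, y, wrt, h=1e-5):
    """Central-difference estimate of a first partial derivative of f(x, y)."""
    if wrt == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def exactness_gap(M, N, x, y):
    """Return dM/dy - dN/dx at (x, y); this vanishes for an exact equation."""
    return partial(M, x, y, "y") - partial(N, x, y, "x")


# Coefficients of Example 14-5, before and after multiplying by mu = y.
M = lambda x, y: 2 * x * y + y ** 2
N = lambda x, y: 2 * x ** 2 + 3 * x * y
mu = lambda x, y: y
muM = lambda x, y: mu(x, y) * M(x, y)
muN = lambda x, y: mu(x, y) * N(x, y)

# Candidate potential function from the general solution x^2 y^2 + x y^3 = C.
F = lambda x, y: x ** 2 * y ** 2 + x * y ** 3

points = [(0.5, 1.5), (1.0, 2.0), (-1.2, 0.7)]   # arbitrary sample points
gap_before = max(abs(exactness_gap(M, N, x, y)) for x, y in points)
gap_after = max(abs(exactness_gap(muM, muN, x, y)) for x, y in points)
potential_error = max(
    abs(partial(F, x, y, "x") - muM(x, y)) + abs(partial(F, x, y, "y") - muN(x, y))
    for x, y in points
)
print(gap_before, gap_after, potential_error)
```

The first printed number is of order one (the original equation is not exact), while the other two are at the level of rounding error, in agreement with $\mu = y$ being an integrating factor.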


14-4 The linear equation of first order

The general linear equation of first order already encountered in Eqn (13-6)
has the form

$$\frac{dy}{dx} + P(x)y = Q(x). \tag{14-17}$$

To solve it let us seek an integrating factor $\mu$ that will make $\mu(dy/dx) + \mu Py$
a derivative; namely the derivative $(d/dx)(\mu y)$. Since the equation is
linear, $\mu$ must be independent of y, so we need only consider $\mu$ to be of the
form $\mu = \mu(x)$. Thus the integrating factor $\mu$ is required to be a solution of
the equation $(d/dx)(\mu y) = \mu(dy/dx) + \mu Py$. Expanding the left-hand side
and simplifying then leads to the simple differential equation
$y[(d\mu/dx) - P\mu] = 0$. As the solution y of Eqn (14-17) is not identically zero,
it follows that $\mu$ must be the solution of

$$\frac{d\mu}{dx} = P\mu. \tag{14-18}$$

The variables x and $\mu$ are separable, giving

$$\frac{d\mu}{\mu} = P(x)\,dx,$$

showing that $\log|\mu| = \int P(x)\,dx + C'$, where C' is an arbitrary constant.
Taking exponentials we find that the most general integrating factor is

$$\mu = e^{C'} \exp\left\{\int P(x)\,dx\right\}.$$

However, as the arbitrary factor $e^{C'}$ is always non-zero, and so may be
cancelled when this expression is used as a multiplier in Eqn (14-17), we may
always take as the integrating factor the expression

$$\mu = e^{\int P(x)\,dx}. \tag{14-19}$$

Multiplying Eqn (14-17) by $\mu$ and using its properties then gives

$$\frac{d}{dx}\left(y e^{\int P(x)\,dx}\right) = Q(x) e^{\int P(x)\,dx}.$$

After a final integration and simplification we obtain the general solution

$$y = e^{-\int P(x)\,dx}\left(C + \int Q(x) e^{\int P(x)\,dx}\,dx\right), \tag{14-20}$$

where C is the arbitrary constant of the final integration and must be retained.

Although this general solution is useful, in applications it is usually better
to use the fact that expression (14-19) is an integrating factor and to proceed
directly from that point without recourse to Eqn (14-20).

An important general point illustrated by Eqn (14-20) which, as we shall
see later, characterizes all linear equations is that the general solution
comprises the sum of two parts. The first part, $C e^{-\int P(x)\,dx}$, is the solution of the
homogeneous equation (corresponding to Q(x) = 0) and the second part,
$e^{-\int P(x)\,dx} \int Q(x) e^{\int P(x)\,dx}\,dx$, is particular to the form of the inhomogeneous
term Q(x). This second part is called the particular integral, whilst the first
part is the complementary function. Notice that the two parts of the solution
are additive and that the arbitrary constant is associated with the
complementary function. These observations characterize linear differential
equations of any order and will be encountered again in the next chapter.


Example 14-6 Solve

$$\frac{dy}{dx} + ky = a \sin mx$$

subject to the initial condition y = 1 when x = 0.

Solution In this case P(x) = k, so that the integrating factor is $\mu = e^{kx}$.
Hence

$$\frac{d}{dx}\left(y e^{kx}\right) = a e^{kx} \sin mx,$$

giving rise to

$$y e^{kx} = a \int e^{kx} \sin mx\,dx + C$$

or,

$$y = e^{-kx}\left(C + a \int e^{kx} \sin mx\,dx\right).$$

Performing the indicated integration gives the general solution

$$y = C e^{-kx} + \frac{a}{k^2 + m^2}(k \sin mx - m \cos mx),$$

the first term being the complementary function and the second term the
particular integral.
To determine the constant C we now utilize the initial condition y(0) = 1
by writing

$$1 = C - \frac{am}{k^2 + m^2}.$$

The particular solution is thus

$$y = \left(\frac{k^2 + m^2 + am}{k^2 + m^2}\right) e^{-kx} + \frac{a}{k^2 + m^2}(k \sin mx - m \cos mx).$$

If we make the identifications y = i, x = t, k = R/L, $a = V_0/L$, and $m = \omega$,
we discover that we have just obtained the solution of Example 13-3
(concerning the electric circuit), subject to the initial condition that a unit current
flows when the circuit is closed.
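As a rough numerical check of Example 14-6, the particular solution can be compared with a direct step-by-step integration of the differential equation. The sketch below assumes illustrative values for k, a and m and uses a hand-written classical Runge-Kutta routine (`rk4` is a hypothetical helper name, not something defined in the text).

```python
import math


def rk4(f, x0, y0, x_end, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


k, a, m = 2.0, 3.0, 5.0   # illustrative constants, chosen arbitrarily


def closed_form(x):
    """Particular solution of y' + k y = a sin(m x) with y(0) = 1."""
    d = k * k + m * m
    C = (d + a * m) / d
    return C * math.exp(-k * x) + a / d * (k * math.sin(m * x) - m * math.cos(m * x))


numeric = rk4(lambda x, y: a * math.sin(m * x) - k * y, 0.0, 1.0, 2.0, 2000)
difference = abs(numeric - closed_form(2.0))
print(closed_form(0.0), difference)
```

The closed form reproduces the initial value y(0) = 1, and the difference from the numerical integration at x = 2 is very small, which supports the sign pattern in the constant C.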


Example 14-7 Find the general solution of the linear equation

$$\frac{dy}{dx} + xy = x.$$

Solution The integrating factor is $\mu = e^{\int x\,dx}$, so that $\mu = e^{x^2/2}$. Hence

$$\frac{d}{dx}\left(y e^{x^2/2}\right) = x e^{x^2/2}$$

or

$$y e^{x^2/2} = C + \int x e^{x^2/2}\,dx.$$

Since the indefinite integral on the right-hand side is $e^{x^2/2}$, it follows that

$$y = C e^{-x^2/2} + 1.$$

The particular integral in this case is simply the constant unity.


The Bernoulli equation

$$\frac{dy}{dx} + R(x)y = S(x)y^n, \tag{14-21}$$

which is nonlinear due to the term $y^n$, may be transformed
into an equivalent linear equation and solved by the method of this section
if the new dependent variable $u = y^{1-n}$ is introduced. To see this notice that
$du/dx = [(1 - n)/y^n](dy/dx)$, so that substituting in Eqn (14-21) for dy/dx
and dividing throughout by a factor $y^n$ gives the equivalent linear equation

$$\frac{du}{dx} + (1 - n)R(x)u = (1 - n)S(x). \tag{14-22}$$

This is an equation of the form (14-17), with $P(x) = (1 - n)R(x)$ and
$Q(x) = (1 - n)S(x)$.
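A short worked instance may make the substitution concrete. The particular choice R(x) = 1, S(x) = 1, n = 2 below is an illustrative assumption and does not come from the text.

```latex
% Bernoulli equation with R = 1, S = 1, n = 2 (illustrative choice):
\frac{dy}{dx} + y = y^{2}.
% Substituting u = y^{1-n} = 1/y, Eqn (14-22) becomes
\frac{du}{dx} - u = -1 ,
% which is linear with integrating factor \mu = e^{-x}:
\frac{d}{dx}\left(u e^{-x}\right) = -e^{-x}
\quad\Longrightarrow\quad
u e^{-x} = e^{-x} + C
\quad\Longrightarrow\quad
u = 1 + C e^{x}.
% Returning to y = 1/u:
y = \frac{1}{1 + C e^{x}} .
% Check: y' = -C e^{x}/(1 + C e^{x})^{2} = y^{2} - y, as required.
```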


A special class of differential equations that may also be solved by means
of the method described in this section is the class of linear second order
equations that do not depend explicitly on the dependent variable y. They
have the general form

$$\frac{d^2 y}{dx^2} + P(x)\frac{dy}{dx} = Q(x) \tag{14-23}$$

and may be reduced to the form of Eqn (14-17) by writing z = dy/dx. The
solution then amounts to the determination of z from the first order
differential equation

$$\frac{dz}{dx} + P(x)z = Q(x) \tag{14-24}$$

together with the subsequent integration of dy/dx = z(x) to give the general
solution of Eqn (14-23)

$$y = \int z(x)\,dx + C. \tag{14-25}$$

Notice that two constants of integration are involved, one arising in the
determination of z and a second appearing as the integration constant in
Eqn (14-25).

Example 14-8 Find the general solution of

$$\frac{d^2 y}{dx^2} - \frac{1}{x}\frac{dy}{dx} = x e^{2x}.$$

Solution Setting z = dy/dx gives the first order linear equation

$$\frac{dz}{dx} - \frac{z}{x} = x e^{2x},$$

which has the integrating factor 1/x and so

$$\frac{d}{dx}\left(\frac{z}{x}\right) = e^{2x}.$$

The solution z is thus $z = x(C + \tfrac{1}{2}e^{2x})$. Finally, since z = dy/dx, we have

$$y = \int x\left(C + \tfrac{1}{2}e^{2x}\right)dx + D$$

or,

$$y = \tfrac{1}{4}e^{2x}\left(x - \tfrac{1}{2}\right) + \tfrac{1}{2}Cx^2 + D.$$

Because C is an arbitrary constant there is no necessity to retain the factor $\tfrac{1}{2}$
in the second term.

Obviously, this general method of reduction of order applies to any
differential equation not explicitly involving y, though the simplification is
most striking in the situation just discussed.


14-5 Equations with implicit dependence on x

When a second order differential equation does not contain the independent
variable x explicitly, a solution may often be found by utilizing y in the role
of independent variable. This is achieved by using the substitution p = dy/dx,
for which

$$\frac{d^2 y}{dx^2} = \frac{dp}{dx} = \frac{dp}{dy}\frac{dy}{dx} = p\frac{dp}{dy}.$$

Substituting for y' and y'' then reduces the second order differential
equation to one of first order, in which p is the dependent variable and y the
independent variable. The methods described previously in this chapter are
then available for the determination of p, from which y follows by integrating

$$y' = p.$$

Example 14-9 Solve the differential equation

$$yy'' = y'^2.$$

Solution Setting p = dy/dx, substituting for y' and y'', and cancelling p we
obtain y(dp/dy) = p. This is an equation with separable variables and has
the solution $p = C_1 y$, where $C_1$ is an arbitrary constant. Then, as dy/dx = p,
it follows that $dy/dx = C_1 y$, which is again a differential equation with
separable variables. The solution of this equation, which is the desired general
solution of $yy'' = y'^2$, is easily seen to be $y = C_2 e^{C_1 x}$, where $C_2$ is another
arbitrary integration constant. The cancelled factor p may be disregarded
because p = 0 only leads to the trivial solution y = constant.

Naturally, there are many other special cases in which an appropriate
substitution simplifies a non-linear differential equation, though unfortunately
there is no general rule by which these substitutions may be found. For
example, we now illustrate how a substitution of the form $y = z^n$ can
sometimes simplify a non-linear differential equation.

Example 14-10 Solve the differential equation $xy' + 2y + x^2 y^2 = 0$
utilizing the substitution y = 1/z.

Solution It follows from y = 1/z that $y' = -(1/z^2)z'$, so that after
substitution and simplification, the differential equation takes the form
$z' - (2/x)z = x$. This is a linear equation with integrating factor $\mu = 1/x^2$ and
general solution $z = x^2(\log x + C)$. Using the fact that z = 1/y, we thus
discover that the required solution is $y = [x^2(\log x + C)]^{-1}$.


14-6 Clairaut’s and Lagrange’s equations

The name Clairaut’s equation is given to the particularly simple non-linear
first order differential equation having the general form

$$y = x\frac{dy}{dx} + f\left(\frac{dy}{dx}\right).$$

It is usual to set p = dy/dx, so that Clairaut’s equation simplifies to

$$y = xp + f(p). \tag{14-26}$$

To solve this equation it is only necessary to differentiate with respect to
x to obtain

$$p = p + x\frac{dp}{dx} + \frac{df}{dp}\frac{dp}{dx},$$

from which it then follows that

$$\frac{dp}{dx}\left(x + \frac{df}{dp}\right) = 0. \tag{14-27}$$

Equation (14-27) will be satisfied if either of the two factors vanishes so,
considering first the case dp/dx = 0, we see that p = constant. Denoting this
constant value by C, substitution into Eqn (14-26) then gives the general
solution

$$y = Cx + f(C). \tag{14-28}$$

This result has an interesting geometrical interpretation because for any
given value of C, Eqn (14-28) represents a straight line with gradient C and
intercept f(C) on the y-axis. Variation of C causes Eqn (14-28) to describe a
family of straight lines.

Defining a function F(x, y, C) by the relation $F(x, y, C) = y - Cx - f(C)$
allows the family of straight lines (14-28) to be written in the form

$$F(x, y, C) = 0. \tag{14-29}$$

From our previous study of envelopes we know that the envelope of these
lines, when it exists, is determined by eliminating C between Eqn (14-29) and

$$\frac{\partial F}{\partial C} = 0. \tag{14-30}$$

Hence, using the form of F(x, y, C), it follows that the envelope of lines is
obtained by eliminating C between

$$y = Cx + f(C) \tag{14-31}$$

and

$$0 = x + \frac{df}{dC}. \tag{14-32}$$

Now the forms of the second factor in Eqn (14-27) and of Eqn (14-32) are
identical, showing that the special solution of Clairaut’s equation obtained by
equating the second factor of Eqn (14-27) to zero is simply the envelope of
the lines describing the general solution.

By virtue of its definition, points on the envelope lie on lines representing
the general solution, but the envelope is not itself a member of the family of
lines comprising the general solution (14-28). For this reason the solution
corresponding to the equations

$$y = xp + f(p) \tag{14-33}$$

$$0 = x + \frac{df}{dp} \tag{14-34}$$

describing the envelope of the lines (14-28) is called a singular solution.
Equations (14-33) and (14-34) are in fact a parametric representation of the
envelope with the gradient p as parameter.


Example 14-11 Determine the general solution and the envelope of the
curves described by

$$y = xp - p^2.$$

Solution Here $f(p) = -p^2$ and from Eqn (14-28) we see that the general
solution is

$$y = Cx - C^2.$$

The envelope of this family of lines is determined parametrically in terms of
p by Eqns (14-33) and (14-34), in the form

$$y = xp - p^2, \qquad x = 2p.$$

In this case it is possible to eliminate p between these two equations, giving
as the explicit equation for the envelope the parabola $y = \tfrac{1}{4}x^2$.
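The geometrical statement in Example 14-11 can be checked directly: each line of the general solution should touch the parabola at the point x = 2C, with the same gradient there, and lie below it elsewhere. The following sketch uses illustrative function names and sample values of C that are not part of the text.

```python
def line(C, x):
    """Member of the general solution y = Cx - C^2 of Example 14-11."""
    return C * x - C * C


def envelope(x):
    """Singular solution y = x^2 / 4."""
    return x * x / 4


def envelope_slope(x):
    """Gradient of the parabola y = x^2 / 4."""
    return x / 2


checks = []
for C in (-2.0, -0.5, 0.0, 1.0, 3.0):
    x_touch = 2 * C                       # from x = 2p with p = C
    same_point = abs(line(C, x_touch) - envelope(x_touch)) < 1e-12
    same_slope = abs(envelope_slope(x_touch) - C) < 1e-12
    # Away from the contact point the line lies below the parabola,
    # because x^2/4 - (Cx - C^2) = (x/2 - C)^2 >= 0.
    below = all(line(C, x_touch + d) < envelope(x_touch + d) for d in (-1.0, 0.5, 2.0))
    checks.append(same_point and same_slope and below)
print(checks)
```

Every entry of `checks` is true for these sample values, which is what tangency of each line to the envelope requires.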


More general than the Clairaut equation is the so-called Lagrange
equation of the form

$$y = xg(p) + f(p). \tag{14-35}$$

Its solution is again obtained by means of differentiation with respect to
x, this time leading to the result

$$p = g(p) + \left(x\frac{dg}{dp} + \frac{df}{dp}\right)\frac{dp}{dx}. \tag{14-36}$$

If now we suppose $g(p) \ne p$, and p is regarded as the independent variable,
with x the dependent variable, we arrive at

$$\frac{dx}{dp} + \frac{g'(p)}{g(p) - p}\,x = \frac{f'(p)}{p - g(p)}, \tag{14-37}$$

where the prime denotes differentiation with respect to p. This is a linear
differential equation which, when solved, will express x in terms of p and an
arbitrary constant. Substitution of the result into Eqn (14-35) then also
enables y to be determined as a function of p and an arbitrary constant. Here
again the solution is obtained parametrically in terms of p.

The condition $g(p) \ne p$ presents no restriction since we have already
solved the Clairaut equation that results when g(p) = p. However, it can
happen that the divisor g(p) - p has zeros for $p = C_1, C_2, \ldots, C_n$. Then,
as with the Clairaut equation, singular integrals will follow directly from
Eqn (14-35) by setting $p = C_1, C_2, \ldots, C_n$.


Example 14-12 Solve the Lagrange equation

$$y = xp^2 + p^2.$$

Solution Differentiating with respect to x gives

$$p = p^2 + 2xp\frac{dp}{dx} + 2p\frac{dp}{dx}$$

or,

$$p(1 - p) = 2p(x + 1)\frac{dp}{dx}.$$

Hence, provided $p \ne 0$ and $p \ne 1$, we see that

$$\frac{dx}{dp} - \frac{2}{1 - p}\,x = \frac{2}{1 - p}.$$

This has as its integrating factor the expression $(1 - p)^2$ and as its solution
the expression

$$x = \frac{C}{(1 - p)^2} + \frac{p(2 - p)}{(1 - p)^2}.$$

Combining this result with the original differential equation shows that

$$y = (1 + C)\frac{p^2}{(1 - p)^2}.$$

These expressions for x and y are the parametric form of the general solution
involving an arbitrary constant C and p as the parameter.

Returning now to the zeros p = 0 and p = 1 of the divisor p(1 - p) we
know that these will give rise to singular integrals. Setting p = 0 and p = 1,
respectively, into the original differential equation finally gives for the
singular integrals y = 0 and y = x + 1.
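A parametric solution of this kind is easy to test numerically: along the curve, dy/dx computed as (dy/dp)/(dx/dp) must equal the parameter p, and the pair (x, y) must satisfy the original equation. The sketch below assumes the parametric form $x = [C + p(2 - p)]/(1 - p)^2$, $y = (1 + C)p^2/(1 - p)^2$ for Example 14-12; the value of C, the sample values of p and the difference step are arbitrary illustrative choices.

```python
def x_of(p, C):
    """x as a function of the parameter p for the Lagrange equation y = x p^2 + p^2."""
    return (C + 2 * p - p * p) / (1 - p) ** 2


def y_of(p, C):
    """y as a function of the parameter p."""
    return (1 + C) * p * p / (1 - p) ** 2


def slope(p, C, h=1e-6):
    """dy/dx along the curve, estimated as (dy/dp) / (dx/dp) by central differences."""
    dy = y_of(p + h, C) - y_of(p - h, C)
    dx = x_of(p + h, C) - x_of(p - h, C)
    return dy / dx


C = 0.75                                   # arbitrary constant of integration
residuals = []
for p in (-1.5, -0.4, 0.3, 0.6, 2.0):      # avoid the excluded values p = 0 and p = 1
    x, y = x_of(p, C), y_of(p, C)
    algebraic = abs(y - (x * p * p + p * p))   # y = x p^2 + p^2
    differential = abs(slope(p, C) - p)        # dy/dx = p
    residuals.append(max(algebraic, differential))
worst = max(residuals)
print(worst)
```

Both residuals stay at the level of rounding and differencing error, so the parametric expressions are consistent with the differential equation.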


14-7 Picard’s iterative method

The method of solution of the initial value problem

$$\frac{dy}{dx} = f(x, y), \qquad y(x_0) = y_0 \tag{14-38}$$

which we now outline briefly is due to E. Picard. It provides an interesting
example of an important iterative procedure and is based on the formal
integration of Eqn (14-38). Recalling that, as in Eqn (13-18), we may rewrite
Eqn (14-38) in the form

$$\frac{dy}{dx} = f(x, y(x)), \qquad y(x_0) = y_0 \tag{14-39}$$

integration of this result using the dummy variable t gives

$$y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\,dt. \tag{14-40}$$

Notice that this is a simple integral equation of the form already encountered
in Problem 14-6 at the end of this chapter.

Picard’s approach to the solution of this equation was to begin by
introducing an approximate solution $y^{(0)}(x)$ into the integral in Eqn (14-40),
thereby enabling its evaluation. He then showed that under very reasonable
conditions on f(x, y), the resulting right-hand side of Eqn (14-40) generated
a better approximation $y^{(1)}(x)$ to the true solution y(x). Thereafter, by the
same argument, inserting $y^{(1)}(x)$ into the integral generates a still better
approximation $y^{(2)}(x)$ to y(x), which in turn may be used to generate an even
better approximation $y^{(3)}(x)$, and so forth.

Starting from $y^{(0)}(x)$ this process thus generates a sequence of
approximations $y^{(1)}(x), y^{(2)}(x), \ldots, y^{(n)}(x), \ldots$ to the solution y(x) of Eqn (14-38)
by means of the iterative process defined by

$$y^{(n+1)}(x) = y_0 + \int_{x_0}^{x} f(t, y^{(n)}(t))\,dt. \tag{14-41}$$

Picard established that the functions $y^{(n)}(x)$, called the iterates of this
process, converge to the function y(x) which is the solution of Eqn (14-38).
Any starting approximation $y^{(0)}(x)$ may be used, though naturally the
convergence of the method is more rapid if a reasonable approximation to the
solution is taken.

When nothing is known about the solution, as is indeed usually the case,
it is customary to take as the starting iterate $y^{(0)}(x)$ the constant function
$y^{(0)}(x) = y_0$, since this at least satisfies the initial conditions of Eqn (14-38).


Example 14-13 We shall first illustrate Picard’s iterative process by using
the simple differential equation

$$\frac{dy}{dx} = xy, \qquad y(0) = 1,$$

which was frequently used for numerical work in Chapter 13. Here the
function f(x, y) = xy and $y_0 = y(0) = 1$, so that we shall take as the starting
iterate $y^{(0)}(x)$ the function $y^{(0)}(x) = 1$. Then by Eqn (14-41), with n = 0, we
have

$$y^{(1)}(x) = 1 + \int_0^x t \cdot 1\,dt$$

giving

$$y^{(1)}(x) = 1 + \frac{x^2}{2}.$$

Again using Eqn (14-41), but this time setting n = 1, we have

$$y^{(2)}(x) = 1 + \int_0^x t\,y^{(1)}(t)\,dt$$

or,

$$y^{(2)}(x) = 1 + \int_0^x \left(t + \frac{t^3}{2}\right)dt,$$

showing that

$$y^{(2)}(x) = 1 + \frac{x^2}{2} + \frac{x^4}{8}.$$

Continuing this process generates for the third and fourth iterates
$y^{(3)}(x)$ and $y^{(4)}(x)$ the expressions

$$y^{(3)}(x) = 1 + \frac{x^2}{2} + \frac{x^4}{8} + \frac{x^6}{48}$$

and

$$y^{(4)}(x) = 1 + \frac{x^2}{2} + \frac{x^4}{8} + \frac{x^6}{48} + \frac{x^8}{384}.$$

The differential equation has separable variables and is easily integrated
to give the exact solution $y = e^{x^2/2}$. Expressing $e^{x^2/2}$ in series form shows
that in this case the nth iterate generated by the process simply comprises the
first (n + 1) terms of the series for $e^{x^2/2}$.
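Because every iterate in Example 14-13 is a polynomial, the iteration (14-41) can be carried out exactly with rational arithmetic. The sketch below is one possible way of doing so; the list-of-coefficients representation and the function name `picard_step` are illustrative assumptions, and the routine is specialized to f(x, y) = xy with $x_0 = 0$, $y_0 = 1$.

```python
from fractions import Fraction


def picard_step(coeffs):
    """One Picard step for y' = x*y, y(0) = 1.

    coeffs[k] is the coefficient of x**k in the current iterate.
    """
    # Multiply the iterate by t: every power is shifted up by one.
    integrand = [Fraction(0)] + list(coeffs)
    # Integrate from 0 to x: t**k becomes x**(k+1) / (k+1).
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    integral[0] += 1          # add the initial value y0 = 1
    return integral


iterate = [Fraction(1)]       # starting iterate y^(0)(x) = 1
iterates = [iterate]
for _ in range(4):
    iterate = picard_step(iterate)
    iterates.append(iterate)

# Non-zero coefficients of y^(4): those of x^0, x^2, x^4, x^6, x^8.
fourth = [c for c in iterates[4] if c != 0]
print(fourth)
```

The non-zero coefficients of the fourth iterate come out as 1, 1/2, 1/8, 1/48 and 1/384, which are the first five terms of the series for $e^{x^2/2}$.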

In the above example the iterates $y^{(n)}(x)$ were easy to compute and they
converged very rapidly to the exact solution. This is not always the case,
since often the integrations become extremely involved, making the
determination of more than the first two or three iterates all but impossible.
Naturally, an unfortunate choice of the starting iterate $y^{(0)}(x)$ might also
complicate the ensuing calculations. It is for these reasons that generally
Picard’s method is not suitable for obtaining numerical solutions, though
when appropriate it is extremely useful. The Picard iterates for Example 14-13
are compared graphically with the exact solution in Fig. 14-1.




Fig. 14-1 Picard iterates for y' = xy, y(0) = 1.


Example 14-14 A less trivial example is provided by the solution of the
non-linear differential equation

$$\frac{dy}{dx} = x^2 + y^2$$

subject to the initial condition y(1) = 2.
Setting n = 0 in Eqn (14-41) and applying the result to the differential
equation, we obtain for the start of our iterative scheme

$$y^{(1)}(x) = 2 + \int_1^x \left[t^2 + \left(y^{(0)}(t)\right)^2\right]dt,$$

since the initial condition y(1) = 2 implies that $x_0 = 1$ and $y_0 = 2$.
Taking as the starting iterate the function $y^{(0)}(x) = 2$, we find that

$$y^{(1)}(x) = 2 + \int_1^x (t^2 + 4)\,dt,$$

giving rise to

$$y^{(1)}(x) = -\frac{7}{3} + 4x + \frac{x^3}{3}.$$

To proceed to the next iterate $y^{(2)}(x)$ is straightforward, and using
$y^{(1)}(x)$ in conjunction with Eqn (14-41), in which we set n = 1, gives

$$y^{(2)}(x) = 2 + \int_1^x \left[t^2 + \left(-\frac{7}{3} + 4t + \frac{t^3}{3}\right)^2\right]dt.$$

After simplification this becomes

$$y^{(2)}(x) = \frac{39}{630} + \frac{49}{9}x - \frac{28}{3}x^2 + \frac{17}{3}x^3 - \frac{7}{18}x^4 + \frac{8}{15}x^5 + \frac{1}{63}x^7.$$

It is clear that although the integrations that result here will always be
possible, the rapid increase in algebraic complexity will make the computation of
further iterates very tedious.
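The tedious algebra of Example 14-14 is exactly the kind of work that can be delegated to exact rational arithmetic. The following sketch assumes polynomials stored as coefficient lists (lowest power first); the helper names `poly_mul`, `poly_add`, `poly_eval` and `picard_step` are hypothetical and are written only for this illustration.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [u + v for u, v in zip(a, b)]


def poly_eval(a, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(a))


def picard_step(y, x0=Fraction(1), y0=Fraction(2)):
    """Return y0 + integral from x0 to x of (t^2 + y(t)^2) dt as a polynomial."""
    integrand = poly_add([Fraction(0), Fraction(0), Fraction(1)], poly_mul(y, y))
    antideriv = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    # Fix the constant term so that the new iterate equals y0 at x = x0.
    antideriv[0] = y0 - poly_eval(antideriv, x0)
    return antideriv


y_start = [Fraction(2)]       # starting iterate y^(0)(x) = 2
y1 = picard_step(y_start)
y2 = picard_step(y1)
print(y1)
print(y2)
```

The first list reproduces the coefficients of $y^{(1)}(x)$ and the second those of $y^{(2)}(x)$, with the constant term appearing in lowest terms as 13/210 = 39/630. A third call to `picard_step` already produces a polynomial of degree 15, which illustrates the growth in complexity noted above.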


The practical difficulty so often encountered when Picard’s iterative
method is used as a means of obtaining numerical solutions becomes more
apparent when an initial value problem of the following kind is considered:

$$y' = x + \sin y, \qquad y(0) = \tfrac{1}{2}\pi.$$

Taking for the starting iterate $y^{(0)}(x) = \tfrac{1}{2}\pi$ there is no difficulty in
obtaining for the first iterate $y^{(1)}(x)$ the expression

$$y^{(1)}(x) = \tfrac{1}{2}\pi + x + \tfrac{1}{2}x^2.$$

However, when an attempt is made to compute $y^{(2)}(x)$, we find that

$$y^{(2)}(x) = \tfrac{1}{2}\pi + \int_0^x \left[t + \sin\left(\tfrac{1}{2}\pi + t + \tfrac{1}{2}t^2\right)\right]dt,$$

in which the trigonometric integral involved is no longer expressible in terms
of elementary functions.
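When the integrals cannot be found in closed form, the iteration (14-41) can still be carried out approximately by tabulating each iterate on a grid and replacing the integral by a quadrature rule. The sketch below does this with the trapezoidal rule; the routine name `picard_on_grid`, the interval 0 to 1, the grid size and the number of sweeps are all illustrative assumptions rather than anything prescribed by the text.

```python
import math


def picard_on_grid(f, x0, y0, x_end, n_points, sweeps):
    """Apply Picard's iteration on a uniform grid using the trapezoidal rule."""
    h = (x_end - x0) / (n_points - 1)
    xs = [x0 + i * h for i in range(n_points)]
    ys = [y0] * n_points                  # starting iterate y^(0)(x) = y0
    history = [ys]
    for _ in range(sweeps):
        g = [f(x, y) for x, y in zip(xs, ys)]
        new = [y0]
        for i in range(1, n_points):
            # Cumulative trapezoidal approximation to the integral in (14-41).
            new.append(new[-1] + h * (g[i - 1] + g[i]) / 2)
        ys = new
        history.append(ys)
    return xs, history


xs, history = picard_on_grid(lambda x, y: x + math.sin(y), 0.0, math.pi / 2, 1.0, 201, 10)
first_at_1 = history[1][-1]               # first iterate evaluated at x = 1
change_last_sweep = max(abs(a - b) for a, b in zip(history[-1], history[-2]))
print(first_at_1, change_last_sweep)
```

The first sweep reproduces $y^{(1)}(1) = \tfrac{1}{2}\pi + \tfrac{3}{2}$, since the trapezoidal rule is exact for the linear integrand t + 1, and the change between the last two sweeps is small, which is consistent with the convergence of the iterates on this interval.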

Although we shall not discuss it here, the real importance of Picard’s 
method is not as a numerical technique but as a powerful theoretical tool by 
which conditions for the existence of solutions may be examined in very 
general terms. 


14-8 Direct deductions and comparison theorems

So far we have discussed a variety of special methods for the exact solution
of first order differential equations together with several possible numerical
techniques of varying applicability and accuracy. We close this chapter by
indicating how these techniques may often be usefully supplemented by a
direct examination of the differential equation itself and by the exploitation
of a simple idea deriving directly from the geometrical notion of an integral
curve.

On occasions it is helpful to deduce properties of solutions of a differential
equation directly from the equation itself without first obtaining the general
solution. Indeed, in many practical situations the general solution is not
known, as is the case with most non-linear equations. In these circumstances
any property of the solutions that may be deduced directly is likely to be of
considerable value.

This idea is best illustrated by a simple example and for this purpose we
shall use Example 13-1. The equation in question is

$$\frac{dv}{dt} = g - \frac{\lambda}{m}v^2, \tag{14-42}$$

in which v denotes the velocity of a falling particle of mass m at time t in a
resisting medium with a quadratic velocity resistance law having
proportionality constant $\lambda$.

Suppose it is required to find the limiting constant velocity of fall c of the
particle, often called the terminal velocity. Then we know that under these
conditions the acceleration dv/dt must be identically zero and hence Eqn
(14-42) reduces to

$$0 = g - \frac{\lambda}{m}c^2,$$

giving

$$c = (mg/\lambda)^{1/2}. \tag{14-43}$$

This result has been obtained directly without integration of Eqn (14-42)
and is typical of many situations in which an examination of the differential
equation with the physical problem clearly in mind will yield direct and useful
information. Result (14-43) may of course be obtained from the solution of
Eqn (14-42), but with far greater effort. If, for example, the case of a particle
falling from rest at time t = 0 were to be considered then separation of the
variables and integrating would give for the solution

$$v = c \tanh\left(\frac{\lambda c t}{m}\right). \tag{14-44}$$

Thus, as $\lim_{x \to \infty} \tanh x = 1$, it follows directly from Eqn (14-44) that
$\lim_{t \to \infty} v = c$.
It would be instructive for the reader to derive solution (14-44) by integration
and then to compare the magnitude of the task with the simple argument that
gave rise to Eqn (14-43).
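The relation between Eqns (14-42), (14-43) and (14-44) can also be examined numerically. In the sketch below the values of g, $\lambda$ and m are illustrative assumptions (SI units are assumed), and `rk4_velocity` is a hypothetical helper that integrates Eqn (14-42) from rest by the classical Runge-Kutta method.

```python
import math

g, lam, m = 9.81, 0.25, 80.0          # illustrative values, SI units assumed
c = math.sqrt(m * g / lam)            # terminal velocity, Eqn (14-43)


def accel(v):
    """Right-hand side of Eqn (14-42)."""
    return g - lam / m * v * v


def rk4_velocity(t_end, steps):
    """Integrate dv/dt = accel(v) from v(0) = 0 up to time t_end."""
    h = t_end / steps
    v = 0.0                           # particle released from rest
    for _ in range(steps):
        k1 = accel(v)
        k2 = accel(v + h * k1 / 2)
        k3 = accel(v + h * k2 / 2)
        k4 = accel(v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return v


t = 5.0
v_numeric = rk4_velocity(t, 1000)
v_formula = c * math.tanh(lam * c * t / m)   # Eqn (14-44)
v_late = rk4_velocity(60.0, 6000)            # velocity long after release
print(c, v_numeric, v_formula, v_late)
```

The numerical velocity agrees with Eqn (14-44) at t = 5, and at late times it settles at the value c found directly from the differential equation.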
Useful though this direct approach may often be, still more valuable is
the notion of a comparison theorem that we now introduce. The idea is
simple and relies on the geometrical concept of an integral curve. We now
state the following:

THEOREM 14-2 If the two differential equations

$$\frac{dy}{dx} = f(x, y) \quad \text{and} \quad \frac{dz}{dx} = G(x, z)$$

have the same initial conditions $y = z = y_0$ at $x = x_0$ and, furthermore,

$$f(x, y) < G(x, y),$$

then y < z for $x > x_0$.

Proof This result follows from the fact that because of the condition
f(x, y) < G(x, y) and the common initial conditions, the gradient of the
z-integral curve through point $(x_0, y_0)$ is always greater than that of the
y-integral curve and so the z-curve will always lie above the y-curve.

An obvious extension of this reasoning produces the more general result:

THEOREM 14-3 If the three differential equations

$$\frac{dw}{dx} = g(x, w), \qquad \frac{dy}{dx} = f(x, y), \qquad \frac{dz}{dx} = G(x, z)$$

have the same initial conditions $w = y = z = y_0$ at $x = x_0$ and, furthermore,

$$g(x, y) < f(x, y) < G(x, y),$$

then w < y < z for $x > x_0$.

Theorems 14-2 and 14-3 are special cases of comparison theorems in
which the solution of a complicated differential equation is compared with
the solution of a related and usually simpler differential equation that may
be integrated exactly. By means of such theorems useful information may
often be obtained about the solution of a differential equation for which the
general solution is unknown.

The situation in Theorem 14-3 is illustrated in Fig. 14-2, where the
solution y is represented by the full line and the comparison solutions w and z
are represented, respectively, by the lower and upper dotted lines.

To show how these arguments may be applied we shall again consider
the non-linear differential equation

$$\frac{dy}{dx} = x^2 + y^2$$

subject to the initial condition y(1) = 2, used in Example 14-14 in connection
with the calculation of the Picard iterates.

Since $x^2 + y^2$ is essentially positive, it follows that the integral curve for
y is a monotonic increasing function of x so that by assigning to y its value 2
at the initial point we must have

$$x^2 + 4 < x^2 + y^2 \quad \text{for } x > 1.$$

Fig. 14-2 Comparison solutions.

Thus if in Theorem 14-3 we make the identification $f(x, y) = x^2 + y^2$,
then for the function g(x, y) we may choose $g(x, y) = x^2 + 4$.

To obtain a suitable function G(x, y) it is now necessary to specify the
range of values of y that will interest us. Let us suppose that we wish to
estimate y as a function of x in the interval 2 < y < 3. Then for y in this range
we must obviously have the inequality

$$x^2 + y^2 < x^2 + 9$$

so that for the function G(x, y) we may take $G(x, y) = x^2 + 9$.

Applying the conditions of Theorem 14-3 we then see that the desired
solution y lies between the comparison solutions w and z of the comparison
equations

$$\frac{dw}{dx} = x^2 + 4 \quad \text{and} \quad \frac{dz}{dx} = x^2 + 9$$

with w = z = 2 at x = 1.
These comparison equations are easily integrated to give

$$w = \frac{x^3}{3} + 4x - \frac{7}{3} \quad \text{and} \quad z = \frac{x^3}{3} + 9x - \frac{22}{3}.$$

We thus know that for the interval 2 < y < 3 we must have

$$\frac{x^3}{3} + 4x - \frac{7}{3} < y < \frac{x^3}{3} + 9x - \frac{22}{3}.$$

We may easily check this result by assuming that the second Picard
iterate $y^{(2)}(x)$ computed in Example 14-14 provides a good approximation to
the true solution at some value of x close to x = 1. If we choose x = 1.1,
then $y^{(2)}(1.1) = 2.6174$, whilst the above inequalities become
2.5104 < y(1.1) < 3.0104, thereby showing that the upper and lower bounds on
y(1.1) are reasonably close.

To assess the value of comparison solutions it is necessary to obtain both
upper and lower bounds for the desired solution and then to examine the
rate at which these diverge as x increases.

The comparison solution w is less than y for all x > 1 and, if desired, it
may be used to place a lower bound on y for any value of x greater than 1.
However, it is clear that the error will grow rapidly, and even when x = 1.2,
taking the second Picard iterate $y^{(2)}(x)$ as the exact value, we obtain
$y^{(2)}(1.2) = 3.5152$, whereas w(1.2) = 3.0427. In these circumstances, although the
inequality w(1.2) < y(1.2) is still true, it is clear that to take w(1.2) as an
estimate of y(1.2) is likely to prove unsatisfactory.

To extend similarly the estimate z of the solution y it is of course necessary
to take a larger upper limit for y in the inequality 2 < y < 3, though if this
is done the upper estimate z(x) will move further from w(x), making the result
less precise.

However, rather than describe various devices that may be used to
overcome these difficulties in special cases, we shall present a different form of
comparison Theorem 14-3 that is sometimes of value.


THEOREM 14-4 If y is a solution of the differential equation

$$\frac{dy}{dx} = h(x, y) + r(x),$$

where 0 < m < r(x) < M, then w < y < z, where w and z are solutions of
the equations

$$\frac{dw}{dx} = h(x, w) + m \quad \text{and} \quad \frac{dz}{dx} = h(x, z) + M$$

with $w = y = z = y_0$ at $x = x_0$.

Proof This follows directly from Theorem 14-3 by identifying the function
f(x, y) with h(x, y) + r(x), when it is at once obvious that

$$h(x, y) + m < f(x, y) < h(x, y) + M.$$

The functions g(x, y) and G(x, y) then become h(x, y) + m and h(x, y) + M,
respectively.

We may apply Theorem 14-4 to the previous example

$$\frac{dy}{dx} = x^2 + y^2$$

with y(1) = 2 by simply making the identifications

$$h(x, y) = y^2, \qquad r(x) = x^2.$$

Then if we require a solution in the x-interval 1 < x < 2, we have

$$1 + y^2 < x^2 + y^2 < 4 + y^2$$

so that for 1 < x < 2, y(x) is intermediate between the solutions w and z of
the comparison equations

$$\frac{dw}{dx} = 1 + w^2 \quad \text{and} \quad \frac{dz}{dx} = 4 + z^2$$

with w = z = 2 when x = 1.
On integration of these simple comparison equations we obtain

$$w = \tan(x + 0.107)$$
$$z = 2 \tan 2(x - 0.608).$$

Setting x = 1.1 to compare these numerical bounds with those obtained
by using Theorem 14-3 we find that w(1.1) = 2.6264 and z(1.1) = 3.0078.
Thus we now know that 2.6264 < y(1.1) < 3.0078 which, as it happens, is a
much closer estimate than was previously found.

It is interesting to notice that as w(1.1) is a strict underestimate of y(1.1),
the approximate value y(1.1) = 2.6174 that was computed from the second
Picard iterate $y^{(2)}(x)$ is too low. Further iterates are clearly necessary to
obtain an estimate of y(1.1) by the Picard method that is even accurate to
two figures.
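The bounds supplied by the comparison theorems can be set beside a direct numerical integration of the equation $dy/dx = x^2 + y^2$, y(1) = 2. The sketch below is an illustration only: `rk4` is a hypothetical hand-written Runge-Kutta helper, and the constants in the comparison solutions are evaluated from the initial condition without rounding, so the printed values may differ slightly in the last figures from the rounded values quoted in the text.

```python
import math


def rk4(f, x0, y0, x_end, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


x = 1.1
y_ref = rk4(lambda s, y: s * s + y * y, 1.0, 2.0, x, 1000)

# Theorem 14-3 with g = x^2 + 4 and G = x^2 + 9 (valid while 2 < y < 3).
w3 = x ** 3 / 3 + 4 * x - 7 / 3
z3 = x ** 3 / 3 + 9 * x - 22 / 3

# Theorem 14-4 with m = 1 and M = 4 (valid for 1 < x < 2):
# w = tan(x + c1) with tan(1 + c1) = 2, z = 2 tan(2x + c2) with tan(2 + c2) = 1.
w4 = math.tan(x - 1 + math.atan(2.0))
z4 = 2 * math.tan(2 * (x - 1) + math.pi / 4)

print(w3, w4, y_ref, z3, z4)
```

For this value of x the numerical solution lies above both lower bounds and below both upper bounds, and the lower bound from Theorem 14-4 is noticeably sharper than the one from Theorem 14-3.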


PROBLEMS 


Section 14-1

14-1 Find the general solution of the following problems by separating the
variables:
(a) $\tan x \sin^2 y\,dx + \cos^2 x \cot y\,dy = 0$;
(b) xy' - y = ;
(c) $xyy' = 1 - x^2$;
(d) $y - xy' = a(1 + x^2 y')$;
(e) $3e^x \tan y\,dx + (1 - e^x) \sec^2 y\,dy = 0$;
(f) $y' \tan x = y$.


Section 14-2

14-2 Find the solutions of the following homogeneous and near-homogeneous
equations:
(a) $(x - y)y\,dx - x^2\,dy = 0$ if initially x = 1, y = 2;
(b) $(2x + 4y - 3)\,dx - (x + 2y + 1)\,dy = 0$;
(c) $(x^2 + y^2)\,dx - 2xy\,dy = 0$;
(d) $(x^2 - 3y^2)\,dx + 2xy\,dy = 0$;
(e) $\left(y + x \sinh^2 \frac{y}{x}\right)dx - x\,dy = 0$.


Section 14-3 


14-3 Show that the stated functions $\mu$ are integrating factors for their associated
differential equations and find the general solutions:


2 3 2 3 


2x 
(0) oe. ΧΟ (; + cos xy 


(c) (ye ΡΣ + e¥/?)dx + 2x οοβῇ ἐν ἀν (u = et), 


Jax =0 (μ =1 + cosxy); 


14-4 Determine if $\mu = x^m y^n$ is an integrating factor for the following differential
equations and, if so, deduce appropriate values for m and n. Use your results
to determine the solution in each case:


(a) Gy? + 4xy? + 3y)dx + (6xy + 2x2y + x)dy = 0; 
(b) y(2x + cosh x)dx + 2(χ3 + sinh x)dy = 0; 
(c) (x + y)dx + x(x + 3y)dy = 0; 

δὰ 
(4) 2ev¥dx + (er + Vay = 0 if initially x = 1, y = 0; 
(e) y2dx + (x + cos y)dy = 0; 
(Ὁ γῶ + 3xy")dx + 2x(1 + 2xy2)dy = 0. 


Section 14:4 


14:5 Find the general solutions of the following differential equations, and when 
initial conditions are given determine the particular solution appropriate to 
them: 


Re ae τ 
(a) dx x 2X; 


a ee ee ee ae 
(0). lee eta (y = 0, x = 1); 
dy 2 =; a 


(ὦ x’ +y—-e=0 (¥=1,x=2); 
(e) γ΄ —ytanx=secx (y=1,x=0); 
(Ὁ ydx — [ἰν5 + (| + 1)χ]άν = 0; 

(g) γ΄ + xy = xy? 


(h) y’ + τ = xy" sin x. 
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14:6 


14-7 


14:8 


14:9 


14:10 


14-11 


14-12 


Equations of the form

$$y(x) + k\int_0^x y(t)\,dt = h(x),$$

in which k is a constant and h(x) is a known function of x, are simple examples
of integral equations. These are so called because they involve the unknown
function y together with its integral. They may be converted to a linear first
order differential equation, and hence solved, by differentiation with respect
to x. Use this method to solve the following integral equation:

$$y(x) + \int_0^x y(t)\,dt = e^x$$ with y(0) = 1.


Use the method outlined in Problem 14-6 to solve the slightly more general
integral equation

$$y(x) - \int_0^x y(t)\tan t\,dt = 1 + \sin x$$ with y(0) = 1.

Show that the solution of the homogeneous form of the differential equation

$$\frac{dy}{dx} + P(x)y = Q(x)$$

is

$$y = A\exp\left(-\int P(x)\,dx\right).$$

If A is now regarded as a function of x show, by substituting back into the
inhomogeneous equation, that A(x) is the solution of the equation

$$\frac{dA}{dx} = Q(x)\exp\left(\int P(x)\,dx\right),$$

in which the variables are separable. Integrate this result and find the general
solution of the original differential equation.


By applying the previous method to the differential equation 
dx x 


show that $A(x) = C + e^x(x - 1)$ and hence find the solution that satisfies
the initial condition y = 1 when x = 1. 


Find the general solution of 


which is such that y = 1 and dy/dx = 0 when x = 0. 


Suggest a homogeneous fourth order linear differential equation that can be 
reduced to the solution of a first order equation, and find the general solution. 
Section 14-5
14:13 Solve the following differential equations: 
(a) γ΄ = y?; 
2 ΠΡ 
Ἐν" 
(ὦ γ΄ = γ΄ cos y, given that y = ἐπ, γ΄ = latx=1; 
(e) yy” = γι ΟἿ — 1). 
14-14 Use the indicated substitutions to solve the following differential equations: 


dy Ξ οἷ]. 
() = τ νῷ -- πὸ (y=3): 


(b) 2xy 2 +y=et (y= ΩΣ 
(c) 2y’ — 3y + (x? +1)y3=0 (y Ξ στ "2. 
14-15 Non-linear differential equations of the form

$$\frac{dy}{dx} + a(x) + b(x)y + c(x)y^2 = 0,$$

where a(x), b(x), and c(x) are continuous functions of x are called equations
of the Riccati type.

If $y_1(x)$ is a particular integral of this equation show that the change of
variable $y = y_1 + z$ leads to the differential equation

$$\frac{dz}{dx} + (b(x) + 2c(x)y_1)z + c(x)z^2 = 0.$$

By means of the further change of variable w = 1/z show that w satisfies
the linear differential equation

$$\frac{dw}{dx} - (b(x) + 2c(x)y_1)w = c(x).$$


14-16 Use the results of Problem 14-15 together with the fact that $y_1(x) = \tan x$
is a particular integral to solve 


Section 14-6 


14-17 Obtain the general solutions and, where possible, the envelopes of the 
following Clairaut-type equations in which p = dy/dx: 


(a) y = xp + 2p’; 

(b) $y = xp + \frac{1}{p}$;


(Ὁ) y= xp — p. 
14:18 Obtain the general solutions of the following Lagrange-type equations in 
which p = dy/dx: 
(a) y= (1 + p)x — dp’; 
(b) y = dxp + p*. 
Section 14-7


14:19 Use the Picard iterative method to obtain the first three iterates in the 
following initial value problems by using the starting iterates indicated in 
each case. Evaluate all integrals whenever possible. 


(a) oy -ΞΞ- γ-- sin x with y(0) == |, yOX) = 1; 
(c) = = sin y with y(0) = ἐπ, y(x) = ἐπ; 
dy i 0) = 
(ὦ τ "- xy with 0) = 1, yx) = cos x. 
Section 14-8 


14-20 A particle moves in a resisting medium with velocity v at time t governed by
the differential equation 


dv 

—=k— Ail e~*)v%, 

rr (1 + e~‘) 
with k, A, and « positive constants. Deduce the terminal velocity c of the 
particle. 


14-21 The differential equation 


$$\frac{dy}{dt} = \alpha y - \sin y,$$

in which α is a parameter, y is a displacement, and t is the time, describes the
response of part of a non-linear control system. Obtain the conditions on the
parameter α in order that there should only be four steady state modes of
operation characterized by non-zero values of y. Use graphical methods to
determine the approximate values assumed by y in these steady state modes
of operation.


14:22 The reaction rate m in a chemical engineering process is governed by the 
differential equation 


dm 

— =at bm+ km, 

al + + 
where a, b, and k are constants. Find the possible steady state reaction rates
and deduce the conditions on the constants in order that these rates should 
be positive. 


14-23 Apply Theorem 14-3 to the following initial value problems and determine 
the comparison solutions w and z for the specified ranges of x: 
(a) ἂν sinx+y 


de 1 ean y(0) = 1, for x > 0; 


(b) 2 =x+ye¥%?, γ(1) = 0, ἴογ 1 < y < 3. 
(c) a = yew? − x, y(0) = −1, for −1 > y > −3;


dy 1 + 2x?\ . a 
(d) ri (7 9 sinx, y(0) = 1, forx > 0. 


14:24 Apply Theorem 14-4 to the following two initial value problems and deter- 
mine the comparison solutions w and z: 
1 + 4x2 


dy _ = 
8) ies a T+ x? γ(0) = 1, for x > 0. 


x? tanh x 


dy _ x* tanh x = 
Oa a Ὸ lax” y(2) = 1, for2<x<5. 


Higher order differential 
equations 


15:1. Linear equations with constant coefficients— 
homogeneous case 


Thus far our encounter with differential equations of order greater than unity 
has been essentially confined to the second order equation which served to 
introduce the sine and cosine functions in Chapter 6. We now take up the 
solution of higher order differential equations in more general terms and, 
amongst other matters, extend systematically the notion of a complementary 
function and a particular integral first introduced in connection with a 
linear first order differential equation in Chapter 14. 
Differential equations of the form

$$\frac{d^n y}{dx^n} + a_1\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_n y = f(x),$$ (15-1)

in which $a_1, a_2, \ldots, a_n$ are constants (usually real), are called
linear constant coefficient equations. We begin by studying the homogeneous
form of the equation (that is, $f(x) = 0$) which in this context is usually
called the reduced equation:

$$\frac{d^n y}{dx^n} + a_1\frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_n y = 0.$$ (15-2)


Equation (15-2) may be solved by using the fact that $y = Ce^{\lambda x}$ is
a solution for arbitrary constant C, provided λ satisfies the equation that
results when this expression is substituted into Eqn (15-2), giving

$$(\lambda^n + a_1\lambda^{n-1} + \cdots + a_n)e^{\lambda x} = 0.$$ (15-3)

The substitution $y = Ce^{\lambda x}$ has thus associated a characteristic
polynomial in λ,

$$P(\lambda) \equiv \lambda^n + a_1\lambda^{n-1} + \cdots + a_n,$$ (15-4)

with the differential Eqn (15-1) and, as $e^{\lambda x} \neq 0$, it follows
that the permissible values of λ in the solutions $y = Ce^{\lambda x}$ must be
the roots of $P(\lambda) = 0$. The values of λ are thus determined by solving
the characteristic equation

$$P(\lambda) \equiv \lambda^n + a_1\lambda^{n-1} + \cdots + a_n = 0.$$ (15-5)

This equation is also known as the indicial or auxiliary equation.
Since $P(\lambda)$ is a polynomial of degree n, it follows that
$P(\lambda) = 0$ has precisely n roots $\lambda = \lambda_i$ and, furthermore,
we know that when the coefficients $a_1, a_2, \ldots, a_n$ are real, these roots
must either be real or must occur in complex conjugate pairs. Each expression
$y_i = C_i e^{\lambda_i x}$, i = 1, 2, ..., n, is a solution of Eqn (15-2) for
arbitrary $C_i$, and direct substitution into Eqn (15-2) verifies that

$$y = C_1 e^{\lambda_1 x} + C_2 e^{\lambda_2 x} + \cdots + C_n e^{\lambda_n x}$$ (15-6)

is also a solution. The additive property of solutions of higher order linear
differential equations expressed by this result is often referred to as the
linear superposition of solutions.

This solution of the reduced Eqn (15-2) is called the complementary
function and if the $\lambda_i$ are distinct, the n individual solutions $y_i$
are linearly independent. We shall now prove this assertion of linear
independence indirectly: we assume the $y_i$ to be linearly dependent and show
that this assumption leads to a contradiction.

Consider the case in which the roots $\lambda_i$ are all distinct, but assume
that the $y_i$ are linearly dependent. Then there must exist some constants
$C_1, C_2, \ldots, C_n$, not all zero, such that

$$C_1 e^{\lambda_1 x} + C_2 e^{\lambda_2 x} + \cdots + C_n e^{\lambda_n x} = 0.$$ (15-7)


Successive differentiation of Eqn (15-7) shows that also

$$C_1\lambda_1 e^{\lambda_1 x} + C_2\lambda_2 e^{\lambda_2 x} + \cdots + C_n\lambda_n e^{\lambda_n x} = 0,$$
$$\vdots$$ (15-8)
$$C_1\lambda_1^{n-1} e^{\lambda_1 x} + C_2\lambda_2^{n-1} e^{\lambda_2 x} + \cdots + C_n\lambda_n^{n-1} e^{\lambda_n x} = 0.$$

Now if Eqns (15-7) and (15-8) are to be true for a non-trivial set of constants
$C_i$, the determinant $|W|$ of the coefficients of the $C_i$ must vanish for
all x, thereby giving rise to the condition

$$|W| = \begin{vmatrix}
e^{\lambda_1 x} & e^{\lambda_2 x} & \cdots & e^{\lambda_n x} \\
\lambda_1 e^{\lambda_1 x} & \lambda_2 e^{\lambda_2 x} & \cdots & \lambda_n e^{\lambda_n x} \\
\vdots & \vdots & & \vdots \\
\lambda_1^{n-1} e^{\lambda_1 x} & \lambda_2^{n-1} e^{\lambda_2 x} & \cdots & \lambda_n^{n-1} e^{\lambda_n x}
\end{vmatrix} = 0.$$ (15-9)

The determinant $|W|$ formed in this manner is known as the Wronskian
of the n solutions $y_i$, and plays an important role in more general studies of
differential equations. In this case $|W|$ has a simple form and, as the common
exponential factor in each column is non-zero, it follows that these may be
removed as factors of $|W|$, showing that condition Eqn (15-9) implies the
vanishing of an alternant determinant $|A|$, where
$$|A| = \begin{vmatrix}
1 & 1 & \cdots & 1 \\
\lambda_1 & \lambda_2 & \cdots & \lambda_n \\
\vdots & \vdots & & \vdots \\
\lambda_1^{n-1} & \lambda_2^{n-1} & \cdots & \lambda_n^{n-1}
\end{vmatrix} = 0.$$ (15-10)

Now we know from the theory of determinants that the value of $|A|$ is
simply the product of all possible factors of the form $(\lambda_i - \lambda_j)$
with a suitable sign appended, so that if the roots $\lambda_i$ are all distinct
$|A|$ cannot vanish. This contradiction thus proves that Eqn (15-7) can be true
only if all the $C_i$ are zero, and hence the n solutions $y_i$ must be linearly
independent. We have thus proved:


THEOREM 15-1 (linear independence of solutions: single roots) A differential
equation

$$y^{(n)} + a_1 y^{(n-1)} + \cdots + a_n y = 0$$

which has n distinct roots $\lambda_i$ of its characteristic equation
$P(\lambda) = 0$ has n linearly independent solutions
$y_i = C_i e^{\lambda_i x}$. Its general solution, the complementary function,
is of the form

$$y = C_1 e^{\lambda_1 x} + C_2 e^{\lambda_2 x} + \cdots + C_n e^{\lambda_n x}.$$


Example 15-1 Suppose that $y'' + 3y' + 2y = 0$, then
$P(\lambda) = \lambda^2 + 3\lambda + 2$ and the roots of $P(\lambda) = 0$ are
λ = −1, λ = −2. The linearly independent solutions are $y_1 = C_1 e^{-x}$ and
$y_2 = C_2 e^{-2x}$, and the general solution or complementary function is
$y = C_1 e^{-x} + C_2 e^{-2x}$. A simple calculation shows that the Wronskian
of $e^{-x}$ and $e^{-2x}$ is $|W| = -e^{-3x}$.
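The statements of Example 15-1 can be checked numerically. The short Python sketch below is an illustration only: the function names `residual` and `wronskian` are hypothetical helpers introduced here, and the Wronskian is formed from the functions $e^{\lambda_1 x}$, $e^{\lambda_2 x}$ without the arbitrary constants.

```python
import math

def residual(lam, x):
    """Value of y'' + 3y' + 2y when y = exp(lam * x)."""
    y = math.exp(lam * x)
    return lam ** 2 * y + 3 * lam * y + 2 * y

def wronskian(lam1, lam2, x):
    """2 x 2 Wronskian of exp(lam1 x) and exp(lam2 x)."""
    y1, y2 = math.exp(lam1 * x), math.exp(lam2 * x)
    dy1, dy2 = lam1 * y1, lam2 * y2          # first derivatives
    return y1 * dy2 - y2 * dy1

for x in (0.0, 0.5, 2.0):
    # Both residuals vanish, and the Wronskian agrees with -exp(-3x).
    print(x, residual(-1.0, x), residual(-2.0, x),
          wronskian(-1.0, -2.0, x), -math.exp(-3.0 * x))
```

Removing the exponential factors from the columns leaves the alternant determinant, here simply $\lambda_2 - \lambda_1 = -1$, which is non-zero because the roots are distinct.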


When r of the roots of the characteristic polynomial $P(\lambda)$ coincide and
equal $\lambda^*$, say, then $\lambda = \lambda^*$ is said to be a root of
multiplicity r. The form of the general solution Eqn (15-6) is then inapplicable
because r of its terms are linearly dependent. In this situation an additional
(r − 1) linearly independent solutions need to be determined to complete the
general solution.


THEOREM 15-2 (linear independence of solutions: repeated roots) When
$\lambda = \lambda_1$ is a root of multiplicity r of the characteristic equation
$P(\lambda) = 0$ belonging to

$$y^{(n)} + a_1 y^{(n-1)} + \cdots + a_n y = 0,$$

then $e^{\lambda_1 x}, xe^{\lambda_1 x}, x^2e^{\lambda_1 x}, \ldots, x^{r-1}e^{\lambda_1 x}$
are linearly independent solutions of the differential equation corresponding
to the r-fold root $\lambda_1$.


Proof Because the stated form of the linearly independent solutions may be
established more easily by a different technique, which we shall discuss later,
we only prove the result for a root of multiplicity 2 (a double root). The
assertion of linear independence, however, will be proved for general r.

When $\lambda = \lambda_1$ is a root of multiplicity 2 the characteristic
polynomial $P(\lambda)$ may be written

$$P(\lambda) \equiv \lambda^n + a_1\lambda^{n-1} + \cdots + a_n = (\lambda - \lambda_1)^2 Q(\lambda),$$ (15-11)

where $Q(\lambda)$ is a polynomial of degree (n − 2). Clearly, by definition,
$P(\lambda_1) = 0$, and by differentiating Eqn (15-11) with respect to λ and
then setting $\lambda = \lambda_1$ it also follows that

$$n\lambda_1^{n-1} + (n - 1)a_1\lambda_1^{n-2} + \cdots + a_{n-1} = 0,$$ (15-12)

which is a result that will be needed shortly.

If now we write $y = xe^{\lambda_1 x}$, then
$y^{(r)} = r\lambda_1^{r-1}e^{\lambda_1 x} + \lambda_1^r xe^{\lambda_1 x}$ and,
as $P(\lambda_1) = 0$, substitution into
$y^{(n)} + a_1 y^{(n-1)} + \cdots + a_n y$ gives
$(n\lambda_1^{n-1} + (n - 1)a_1\lambda_1^{n-2} + \cdots + a_{n-1})e^{\lambda_1 x}$.
This vanishes by virtue of Eqn (15-12), so we have shown that
$y = xe^{\lambda_1 x}$ is actually a solution of differential Eqn (15-2) when
$\lambda = \lambda_1$ is a double root. As $y = e^{\lambda_1 x}$ is obviously
also a solution, we have established that when $\lambda = \lambda_1$ is a
double root, the general solution must take the form

$$y = C_1 e^{\lambda_1 x} + C_2 xe^{\lambda_1 x} + C_3 e^{\lambda_3 x} + \cdots + C_n e^{\lambda_n x},$$ (15-13)

where the remaining (n − 2) roots $\lambda_3, \lambda_4, \ldots, \lambda_n$ are
assumed to be distinct.

Whether or not the $\lambda_i$ are multiple roots, n initial conditions must be
specified in order to construct a particular solution from the appropriate
form of the general solution. Used in conjunction with the general solution
they enable n simultaneous equations to be formed for the determination of
the n arbitrary constants $C_1, C_2, \ldots, C_n$. The usual initial conditions
for an nth order differential equation are the specification of the values of
$y, y^{(1)}, y^{(2)}, \ldots, y^{(n-1)}$ at some initial point $x = x_0$.

Now in connection with the Wronskian, we have already seen that a set
of exponential functions is linearly independent provided the exponents are
distinct. Hence, to show that the functions in Eqn (15-13) are linearly
independent, it will be sufficient to show that $e^{\lambda_1 x}$ and
$xe^{\lambda_1 x}$ are linearly independent. This result is self evident,
because removal of the common factor $e^{\lambda_1 x}$ leaves the functions
1 and x, which are obviously linearly independent.

For completeness, and for application to roots of multiplicity greater than
2, we prove the more general result that the functions $1, x, x^2, \ldots, x^m$
are linearly independent.

Assuming first that this is not true and that these functions are linearly
dependent, it follows that there must exist a non-trivial set of constants
$C_0, C_1, \ldots, C_m$ such that, for all x,

$$C_0 + C_1 x + C_2 x^2 + \cdots + C_m x^m = 0.$$

However, we know that as this is an algebraic equation of degree at most m,
there can at most be only m distinct values of x for which it can be true. The
expression cannot thus be true for all x, and so the assumption of linear
dependence is false. This establishes the result.


Example 15-2 Suppose $y''' + 4y'' + 5y' + 2y = 0$, then
$P(\lambda) = \lambda^3 + 4\lambda^2 + 5\lambda + 2 = (\lambda + 1)^2(\lambda + 2)$
and the roots of $P(\lambda) = 0$ are λ = −1, λ = −1, and λ = −2. The root
λ = −1 has multiplicity 2 and so is a double root. The linearly independent
solutions corresponding to λ = −1 are $y_1 = C_1 e^{-x}$ and
$y_2 = C_2 xe^{-x}$, the remaining linearly independent solution being
$y_3 = C_3 e^{-2x}$. The general solution is
$y = C_1 e^{-x} + C_2 xe^{-x} + C_3 e^{-2x}$.
To determine the particular solution appropriate to, say, the initial conditions
y = 1, y' = 0, y'' = 1 when x = 0 we proceed as follows. From the general
solution we find by differentiation that
$y' = -C_1 e^{-x} + C_2(1 - x)e^{-x} - 2C_3 e^{-2x}$ and
$y'' = C_1 e^{-x} - C_2(2 - x)e^{-x} + 4C_3 e^{-2x}$. Substituting the initial
conditions gives rise to the three simultaneous equations $1 = C_1 + C_3$,
$0 = -C_1 + C_2 - 2C_3$, and $1 = C_1 - 2C_2 + 4C_3$, which have as their
solution $C_1 = -1$, $C_2 = 3$, $C_3 = 2$. Hence the required particular
solution is $y = (3x - 1)e^{-x} + 2e^{-2x}$.
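The arithmetic of Example 15-2 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: `solve3` is a hypothetical helper that applies Cramer's rule to the three simultaneous equations, and `derivs` evaluates the general solution and its first three derivatives from the formulas obtained by differentiation.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3 x 3 system a c = b by Cramer's rule."""
    d = det3(a)
    solution = []
    for k in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][k] = b[i]           # replace column k by the right-hand side
        solution.append(det3(m) / d)
    return solution

def derivs(x, c1, c2, c3):
    """y, y', y'', y''' for y = c1 e^-x + c2 x e^-x + c3 e^-2x."""
    e1, e2 = math.exp(-x), math.exp(-2.0 * x)
    y = c1 * e1 + c2 * x * e1 + c3 * e2
    d1 = -c1 * e1 + c2 * (1 - x) * e1 - 2 * c3 * e2
    d2 = c1 * e1 - c2 * (2 - x) * e1 + 4 * c3 * e2
    d3 = -c1 * e1 + c2 * (3 - x) * e1 - 8 * c3 * e2
    return y, d1, d2, d3

# Rows give y(0), y'(0), y''(0) in terms of C1, C2, C3.
A = [[1, 0, 1], [-1, 1, -2], [1, -2, 4]]
c1, c2, c3 = solve3(A, [1, 0, 1])
print(c1, c2, c3)

y, d1, d2, d3 = derivs(0.8, c1, c2, c3)
print(d3 + 4 * d2 + 5 * d1 + 2 * y)   # residual of the differential equation
```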


Example 15-3 Suppose $y^{(5)} + 3y^{(4)} - y^{(3)} - 7y^{(2)} + 4y = 0$, then
$P(\lambda) = \lambda^5 + 3\lambda^4 - \lambda^3 - 7\lambda^2 + 4 = (\lambda - 1)^2(\lambda + 2)^2(\lambda + 1)$
and the roots of $P(\lambda) = 0$ are λ = 1, λ = 1, λ = −2, λ = −2, and
λ = −1. The roots λ = 1 and λ = −2 are double roots and the root λ = −1 is a
single root. The general solution is
$y = C_1 e^{x} + C_2 xe^{x} + C_3 e^{-2x} + C_4 xe^{-2x} + C_5 e^{-x}$.


Example 15-4 Suppose $y^{(5)} - 3y^{(4)} + 3y^{(3)} - y^{(2)} = 0$, then
$P(\lambda) = \lambda^5 - 3\lambda^4 + 3\lambda^3 - \lambda^2 = \lambda^2(\lambda - 1)^3$,
and the roots of $P(\lambda) = 0$ are λ = 0, λ = 0, λ = 1, λ = 1, and λ = 1.
The root λ = 0 is a double root and the root λ = 1 is a triple root. The
general solution is $y = C_1 + C_2 x + C_3 e^{x} + C_4 xe^{x} + C_5 x^2 e^{x}$.
Here the terms $C_1$ and $C_2 x$ are the linearly independent solutions
corresponding to the double root λ = 0.
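The factorizations quoted in Examples 15-3 and 15-4 can be confirmed by multiplying out the linear factors. The following Python sketch does this; the helper names `poly_mul` and `from_roots` are hypothetical, and polynomials are stored as coefficient lists in descending powers of λ.

```python
def poly_mul(p, q):
    """Product of two polynomials given in descending powers."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def from_roots(roots):
    """Monic polynomial whose roots (with multiplicity) are listed."""
    p = [1]
    for r in roots:
        p = poly_mul(p, [1, -r])     # multiply by (lambda - r)
    return p

# Example 15-3: (lambda - 1)^2 (lambda + 2)^2 (lambda + 1)
print(from_roots([1, 1, -2, -2, -1]))
# Example 15-4: lambda^2 (lambda - 1)^3
print(from_roots([0, 0, 1, 1, 1]))
```

A zero in the coefficient list marks a power of λ that is absent from the characteristic polynomial, and hence a derivative that is absent from the differential equation.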

Finally we must give consideration to the situation in which the roots
$\lambda_i$ occur in complex conjugate pairs. Suppose that
$\lambda_s = \mu + i\nu$ and its complex conjugate
$\bar{\lambda}_s = \mu - i\nu$ are roots of the characteristic equation
$P(\lambda) = 0$ of Eqn (15-2). Then, by analogy with the case of real roots,
$y_s = \exp[(\mu + i\nu)x]$ and $y_s^* = \exp[(\mu - i\nu)x]$ must be solutions
of Eqn (15-2). Linear combinations of $y_s$ and $y_s^*$ will also be solutions
of Eqn (15-2) and so, in particular, $u = \frac{1}{2}(y_s + y_s^*)$ and
$v = (1/2i)(y_s - y_s^*)$ will be solutions. A simple calculation then shows
that $u = e^{\mu x}\cos \nu x$ and $v = e^{\mu x}\sin \nu x$. Hence the
combination of terms corresponding to the complex root $\lambda_s$ and its
complex conjugate $\bar{\lambda}_s$ in the general solution gives rise to the
solution

$$e^{\mu x}(A\cos \nu x + B\sin \nu x),$$

where A and B are arbitrary real constants. We have established the following
general result:
THEOREM 15-3 (form of solution: complex roots) When
$\lambda_1 = \mu + i\nu$ and its complex conjugate $\bar{\lambda}_1$ are single
roots of the characteristic equation $P(\lambda) = 0$ of the differential
equation

$$y^{(n)} + a_1 y^{(n-1)} + \cdots + a_n y = 0,$$

and the remaining roots $\lambda_3, \lambda_4, \ldots, \lambda_n$ are real and
distinct, the general solution has the form

$$y = e^{\mu x}(C_1\cos \nu x + C_2\sin \nu x) + C_3 e^{\lambda_3 x} + C_4 e^{\lambda_4 x} + \cdots + C_n e^{\lambda_n x}.$$


The extension of Theorem 15-3 when other complex conjugate pairs of
roots occur in the characteristic polynomial, or when there are multiple real
roots, is obvious and immediate. When a complex root has multiplicity
greater than 1, the results of Theorem 15-2 may be incorporated into Theorem
15-3 to modify the constants $C_1$ and $C_2$. Thus, for example, if the complex
root $\lambda_1$ had multiplicity 2, the general solution stated in Theorem
15-3 would take the form

$$y = e^{\mu x}\{(C_1 + C_2 x)\cos \nu x + (C_3 + C_4 x)\sin \nu x\} + C_5 e^{\lambda_5 x} + C_6 e^{\lambda_6 x} + \cdots + C_n e^{\lambda_n x}.$$


Example 15-5 The differential equation $y'' + 4y' + 13y = 0$ has the
characteristic polynomial
$P(\lambda) = \lambda^2 + 4\lambda + 13 = (\lambda + 2 + 3i)(\lambda + 2 - 3i)$
and the roots of $P(\lambda) = 0$ are λ = −2 − 3i and λ = −2 + 3i. The general
solution is $y = e^{-2x}(C_1\cos 3x + C_2\sin 3x)$.


Example 15-6 The differential equation
$y^{(5)} + 3y^{(4)} + 10y^{(3)} + 6y^{(2)} + 5y^{(1)} - 25y = 0$ has the
characteristic polynomial
$P(\lambda) = \lambda^5 + 3\lambda^4 + 10\lambda^3 + 6\lambda^2 + 5\lambda - 25 = (\lambda - 1)(\lambda + 1 + 2i)^2(\lambda + 1 - 2i)^2$.
The complex roots λ = −1 − 2i and λ = −1 + 2i of $P(\lambda) = 0$ are double
roots, and the single root λ = 1 is the only real root. The general solution is
$y = e^{-x}[(C_1 + C_2 x)\cos 2x + (C_3 + C_4 x)\sin 2x] + C_5 e^{x}$.
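The passage from a complex conjugate pair of roots to real solutions can be illustrated with Python's standard `cmath` module. In the sketch below, which is an illustration rather than part of the original development, `quadratic_roots` and `residual_15_5` are hypothetical helper names; the derivatives of $e^{\mu x}\cos \nu x$ and $e^{\mu x}\sin \nu x$ are obtained as the real and imaginary parts of $\lambda^k e^{\lambda x}$.

```python
import cmath

def quadratic_roots(b, c):
    """Roots of lambda^2 + b*lambda + c = 0, complex when necessary."""
    s = cmath.sqrt(b * b - 4 * c)
    return (-b + s) / 2, (-b - s) / 2

def residual_15_5(x, C1, C2):
    """y'' + 4y' + 13y for y = e^(-2x) (C1 cos 3x + C2 sin 3x)."""
    lam = complex(-2, 3)
    z = cmath.exp(lam * x)           # e^(mu x) (cos nu x + i sin nu x)

    def combine(w):
        # C1 * (real part) + C2 * (imaginary part)
        return C1 * w.real + C2 * w.imag

    y = combine(z)
    d1 = combine(lam * z)            # first derivative
    d2 = combine(lam * lam * z)      # second derivative
    return d2 + 4 * d1 + 13 * y

print(quadratic_roots(4, 13))
print(residual_15_5(0.3, 1.5, -0.7))
```

For the double complex roots of Example 15-6 the same real and imaginary parts appear, each multiplied by the linear factors $(C_1 + C_2 x)$ and $(C_3 + C_4 x)$.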


15:2 Linear equations with constant coefficients— 
inhomogeneous case 


We now examine methods of solution of the inhomogeneous differential
Eqn (15-1). Our approach will be to progress from a semi-intuitive method
known as the method of undetermined coefficients through a rather more
systematic treatment using the operator D, which will be introduced later,
and thence to the method of variation of parameters. To complete the chapter,
a brief introduction is given to the solution of linear differential equations
by means of the Laplace transform.

It is an easily verified fact that
$y = C_1\cos x + C_2\sin x + \frac{1}{2}e^{-x}$ is a solution of the
inhomogeneous equation $y'' + y = e^{-x}$. The first two terms of this solution
obviously comprise the complementary function of the reduced equation
$y'' + y = 0$, whilst the last term is a function which, when substituted into
the differential equation, gives rise to the inhomogeneous term. There thus
appear to be two distinct parts to this solution, the first being the general
solution to the reduced equation and the second, which is additive, being a
solution particular to the form of the inhomogeneous term. We now prove a
theorem that establishes that this is in fact the pattern of solution that
applies to all inhomogeneous linear equations. The sum of the two parts is
termed the general solution or the complete primitive of the inhomogeneous
equation.

To simplify manipulation it will be convenient to introduce a concise
notation for the left-hand side of differential Eqn (15-1) and we achieve this
by defining $L[y] \equiv y^{(n)} + a_1 y^{(n-1)} + \cdots + a_n y$. In terms of
this notation, in which $a_1, a_2, \ldots, a_n$ are understood to be constants,
we now state:


THEOREM 15-4 (form of general solution of linear inhomogeneous equations)
The general solution of the inhomogeneous equation $L[y] = f(x)$ is of the
form $y(x) = y_c(x) + y_p(x)$, where $y_c(x)$ is the general solution or
complementary function of the reduced equation $L[y] = 0$, and $y_p(x)$ is a
particular solution of $L[y] = f(x)$.


Proof The proof is straightforward. Firstly, $y(x) = y_c(x) + y_p(x)$ does
satisfy the equation since $L[y_c(x)] = 0$ and $L[y_p(x)] = f(x)$, and as
$(d^r/dx^r)(y_c + y_p) = (d^r y_c/dx^r) + (d^r y_p/dx^r)$ for r = 1, 2, ..., n,
it follows that

$$L[y(x)] = L[y_c(x) + y_p(x)] = L[y_c(x)] + L[y_p(x)] = 0 + f(x) = f(x).$$

As $y_c(x)$ contains n arbitrary constants we choose to write it in the form
$y_c(x; C_1, C_2, \ldots, C_n)$ to make this explicit. Then, clearly, adding two
complementary functions with differing constants $C_i$ and $C_i'$ gives

$$y_c(x; C_1, C_2, \ldots, C_n) + y_c(x; C_1', C_2', \ldots, C_n') = y_c(x; C_1 + C_1', C_2 + C_2', \ldots, C_n + C_n'),$$

which is simply the same form of complementary function but with modified
constants. Suppose next that $y_{1p}(x)$ and $y_{2p}(x)$ are two particular
solutions of $L[y] = f(x)$. Then
$L[y_{1p} - y_{2p}] = L[y_{1p}] - L[y_{2p}] = f(x) - f(x) = 0$ and so
$y_{1p}(x) - y_{2p}(x)$ is a solution of $L[y] = 0$. Hence
$y_{1p}(x) - y_{2p}(x) = y_c(x; C_1^0, C_2^0, \ldots, C_n^0)$, which is again
the same form of complementary function but with some other set of constants.

Now

$$y_c(x; C_1, \ldots, C_n) + y_{1p}(x) = y_c(x; C_1, \ldots, C_n) + y_{1p}(x) - y_{2p}(x) + y_{2p}(x)$$
$$= y_c(x; C_1, \ldots, C_n) + y_c(x; C_1^0, \ldots, C_n^0) + y_{2p}(x)$$
$$= y_c(x; C_1 + C_1^0, \ldots, C_n + C_n^0) + y_{2p}(x),$$

and so we have shown that any two particular solutions give rise to the same
form of general solution. The arbitrary constants in this general solution must
be determined by applying the initial conditions to the complete primitive
$y(x) = y_c(x) + y_p(x)$.

We have already seen the simplest example of this theorem in connection 
with Eqn (14-20), which clearly displays the two parts of the general solution 
of an inhomogeneous linear equation of first order. As in that section, we 
shall call the particular solution $y_p(x)$ of the inhomogeneous equation a
particular integral of the differential equation. 


15-2 (a) The method of undetermined coefficients


The determination of simple particular integrals by means of the method of 
undetermined coefficients is best illustrated by example. In essence, the method
is based on the fact that simple forms of inhomogeneous term f(x) in Eqn 
(15-1) can only arise as the result of differentiation of obvious functions in 
which the values of certain constants are the only things that need deter- 
mination. A solution is achieved by substitution of a trial function into the 
inhomogeneous equation and subsequent comparison of coefficients of 
corresponding terms. 


Case (a): f(x) a polynomial in x. Suppose $y'' + y = x^2$, then by inspection
we see that the particular integral $y_p$ can only be a polynomial in x and,
furthermore, that it cannot be of degree higher than 2 since the equation
contains an undifferentiated term y. Let us set $y_p = ax^2 + bx + c$, then
$y_p'' = 2a$ and substitution into the original equation gives
$2a + ax^2 + bx + c = x^2$. Equating the coefficients of corresponding powers
of x shows that a = 1, b = 0, 2a + c = 0, so that c = −2, and the required
particular integral must be $y_p = x^2 - 2$. The general solution, or complete
primitive, is $y = C_1\cos x + C_2\sin x + x^2 - 2$. To determine the solution
appropriate to the initial conditions y = −2, y' = 0, say, at
$x = \frac{1}{2}\pi$ we notice that the condition on y gives
$-2 = C_1\cos\frac{1}{2}\pi + C_2\sin\frac{1}{2}\pi + \frac{1}{4}\pi^2 - 2$, or
$C_2 = -\frac{1}{4}\pi^2$, whilst the condition on y' gives
$0 = -C_1\sin\frac{1}{2}\pi + C_2\cos\frac{1}{2}\pi + \pi$, or $C_1 = \pi$.
Hence the solution appropriate to these initial conditions is
$y = \pi(\cos x - \frac{1}{4}\pi\sin x) + x^2 - 2$.

The method extends directly to linear constant coefficient equations of
any order and to polynomials f(x) of any degree. For example, using the
same argument to determine the particular integral of
$y'' + 3y' + y = 1 + x^3$ we would try a particular integral of the form
$y_p = ax^3 + bx^2 + cx + d$. Substitution in the equation and comparison of
the coefficients of corresponding powers of x would then show that a = 1,
9a + b = 0, 6a + 6b + c = 0, and 2b + 3c + d = 1. These equations have the
solution a = 1, b = −9, c = 48, and d = −125. The particular integral must be
$y_p = x^3 - 9x^2 + 48x - 125$.
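The comparison of coefficients in Case (a) is easily mechanized. The Python sketch below is a minimal illustration for the equation $y'' + 3y' + y = 1 + x^3$; the helper names are hypothetical, and polynomials are stored as lists of coefficients in ascending powers of x.

```python
def poly_deriv(p):
    """Derivative of a polynomial [c0, c1, c2, ...] in ascending powers."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def poly_add(*polys):
    """Sum of polynomials of possibly different lengths."""
    n = max(len(p) for p in polys)
    return [sum(p[k] if k < len(p) else 0 for p in polys) for k in range(n)]

def apply_L(p):
    """Form y'' + 3y' + y for the polynomial y = p."""
    d1 = poly_deriv(p)
    d2 = poly_deriv(d1)
    return poly_add(d2, [3 * c for c in d1], p)

# The four equations are triangular, so they are solved in turn.
a = 1
b = -9 * a                # from 9a + b = 0
c = -6 * a - 6 * b        # from 6a + 6b + c = 0
d = 1 - 2 * b - 3 * c     # from 2b + 3c + d = 1
print(a, b, c, d)

y_p = [d, c, b, a]        # d + c x + b x^2 + a x^3
print(apply_L(y_p))       # coefficients of 1 + x^3 in ascending powers
```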


Case (b): f(x) an exponential function. Suppose now that
$y'' + 3y' + 2y = 3e^{2x}$. As $e^{2x}$ does not appear in the complementary
function $y_c = C_1 e^{-x} + C_2 e^{-2x}$ (in which case it would be a solution
of the homogeneous equation), it can only arise in the inhomogeneous term as a
result of differentiation of a function of the form $y_p = ke^{2x}$.
Substituting this in the equation and cancelling the common factor $e^{2x}$
shows that 4k + 6k + 2k = 3 or $k = \frac{1}{4}$. The required particular
integral must thus be $y_p = \frac{1}{4}e^{2x}$. A sum of exponentials
occurring in the inhomogeneous term would be treated analogously, the
constant multiplier of each being determined separately by the above method.

A complication arises if an exponential in the inhomogeneous term also
occurs in the complementary function, as is the case with
$y'' + y' - 2y = 2e^{x}$, which has the complementary function
$y_c = C_1 e^{x} + C_2 e^{-2x}$. Attempting to find a solution by substituting
$y_p = ke^{x}$ would fail here since $e^{x}$ is a solution of
$y'' + y' - 2y = 0$. A moment's reflection and consideration along lines
similar to those concerning Theorem 15-2 shows that in this case we must
try $y_p = kxe^{x}$. Then $y_p' = k(1 + x)e^{x}$ and $y_p'' = k(2 + x)e^{x}$,
so that substitution into the differential equation and cancellation of the
common factor $e^{x}$ gives the condition 3k = 2. In this case the required
particular integral is $y_p = \frac{2}{3}xe^{x}$. By an obvious extension, if
an exponential term $e^{ax}$ appears in the inhomogeneous term f(x), and also
occurs in the complementary function as a result of a root of the
characteristic equation which has multiplicity r, then the particular integral
will be of the form $y_p = kx^r e^{ax}$.

Suppose, for example, that $y'' - 2y' + y = 2e^{x}$, then the complementary
function $y_c = (C_1 + C_2 x)e^{x}$ arises as a result of the double root λ = 1
of the characteristic equation $P(\lambda) = \lambda^2 - 2\lambda + 1 = 0$.
Hence we must seek a particular integral of the form $y_p = kx^2 e^{x}$.
Substituting this in the equation and cancelling the common factor $e^{x}$
shows that k = 1; hence the particular integral is $y_p = x^2 e^{x}$. The
general solution is $y = (C_1 + C_2 x)e^{x} + x^2 e^{x}$.
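The three exponential examples of Case (b) can be handled by one small routine. The Python sketch below relies on a standard identity that is not derived in the text and is stated here as an assumption: if a is a root of multiplicity r of $P(\lambda)$ (with r = 0 when a is not a root), then substituting $y_p = kx^r e^{ax}$ into $P(D)y = ce^{ax}$ gives $k = c/P^{(r)}(a)$, where $P^{(r)}$ is the rth derivative of P. The function names are hypothetical.

```python
def poly_eval(p, x):
    """Evaluate a polynomial given in descending powers (Horner's rule)."""
    value = 0
    for c in p:
        value = value * x + c
    return value

def poly_deriv(p):
    """Derivative of a polynomial given in descending powers."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def trial_constant(p, c, a, tol=1e-12):
    """Return (r, k) such that y_p = k x^r e^(a x) solves P(D)y = c e^(a x).

    p holds the coefficients of the monic characteristic polynomial."""
    r, q = 0, list(p)
    while abs(poly_eval(q, a)) < tol:   # a is still a root: differentiate
        q = poly_deriv(q)
        r += 1
    return r, c / poly_eval(q, a)

print(trial_constant([1, 3, 2], 3, 2))    # y'' + 3y' + 2y = 3e^(2x)
print(trial_constant([1, 1, -2], 2, 1))   # y'' + y' - 2y = 2e^x
print(trial_constant([1, -2, 1], 2, 1))   # y'' - 2y' + y = 2e^x
```

The returned multiplicities 0, 1, and 2 correspond to the trial functions $ke^{2x}$, $kxe^{x}$, and $kx^2e^{x}$ used above.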


Case (c): f(x) a trigonometric function. The same method may be applied to
an inhomogeneous trigonometric term involving sin mx or cos mx. In this
case, provided neither function occurs solely as a term in the complementary
function, a particular integral of the form $y_p = a\cos mx + b\sin mx$ must
be sought. Let us consider $y'' - 2y' + 2y = \sin x$, which has the
complementary function $y_c = e^{x}(C_1\cos x + C_2\sin x)$. Substituting
$y_p = a\cos x + b\sin x$ gives the result
$-a\cos x - b\sin x + 2a\sin x - 2b\cos x + 2a\cos x + 2b\sin x = \sin x$.
For this to be an identity we must equate the coefficients of sin x and cos x
on each side of the equation. For the term sin x this gives rise to the
equation 2a + b = 1, and for the term cos x, the equation a − 2b = 0. Hence
a = 2/5, b = 1/5, so that $y_p = \frac{1}{5}(2\cos x + \sin x)$. The general
solution is of course
$y = e^{x}(C_1\cos x + C_2\sin x) + \frac{1}{5}(2\cos x + \sin x)$.

If, however, the trigonometric function in the inhomogeneous term occurs
as part of the complementary function then considerations similar to those in
the latter part of Case (b) will apply. We must then try to find a particular
integral of the form $y_p = x^r(a\cos mx + b\sin mx)$, where r is the
multiplicity of the root of the characteristic polynomial giving rise to the
term cos mx or sin mx in the complementary function.

Suppose, for example, that $y'' + y = \cos x$, then the characteristic
polynomial is $P(\lambda) = \lambda^2 + 1 = (\lambda - i)(\lambda + i)$, showing
that terms in the complementary function $y_c = C_1\cos x + C_2\sin x$ arise
from roots of $P(\lambda) = 0$ having multiplicity 1. We must thus attempt a
particular solution of the form $y_p = ax\cos x + bx\sin x$. Substitution into
the equation gives
$-2a\sin x + 2b\cos x - ax\cos x - bx\sin x + ax\cos x + bx\sin x = \cos x$.
Equating coefficients of cos x and sin x as before shows that a = 0,
$b = \frac{1}{2}$, and hence $y_p = \frac{1}{2}x\sin x$.
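For a second order equation the comparison of coefficients in Case (c) always leads to two simultaneous equations, which may be solved once and for all. The Python sketch below is an illustration restricted to $y'' + py' + qy = F_c\cos mx + F_s\sin mx$ with $m \neq 0$; the function name `trig_trial` and its parameter names are hypothetical. Substituting $y_p = a\cos mx + b\sin mx$ gives $(q - m^2)a + pmb = F_c$ and $-pma + (q - m^2)b = F_s$, while in the resonant case (p = 0, $q = m^2$) the trial function $x(a\cos mx + b\sin mx)$ gives $2mb = F_c$ and $-2ma = F_s$.

```python
def trig_trial(p, q, m, Fc=0.0, Fs=0.0):
    """Return (r, a, b) with y_p = x^r (a cos mx + b sin mx).

    Solves y'' + p y' + q y = Fc cos mx + Fs sin mx for m != 0."""
    A, B = q - m * m, p * m
    det = A * A + B * B
    if det != 0:
        # Non-resonant case: solve the 2 x 2 system for a and b.
        return 0, (A * Fc - B * Fs) / det, (B * Fc + A * Fs) / det
    # Resonant case: +im and -im are simple roots of P(lambda) = 0.
    return 1, -Fs / (2 * m), Fc / (2 * m)

print(trig_trial(-2, 2, 1, Fs=1.0))   # y'' - 2y' + 2y = sin x
print(trig_trial(0, 1, 1, Fc=1.0))    # y'' + y = cos x
```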


Case (d): f(x) a product of exponential and trigonometric functions. The
previous methods extend to allow the determination of a particular integral
when the inhomogeneous term is of the form $e^{\alpha x}\cos mx$ or
$e^{\alpha x}\sin mx$. In this case a solution of the form
$y_p = e^{\alpha x}(a\cos mx + b\sin mx)$ must be sought. In the case of the
equation $y'' - 3y' + 2y = e^{-x}\sin x$, as the complementary function
$y_c = C_1 e^{x} + C_2 e^{2x}$ does not contain the inhomogeneous term, we seek
a solution of the form $y_p = e^{-x}(a\cos x + b\sin x)$. Substituting and
proceeding as before it is easily established that a = b = 1/10, from which we
deduce that $y_p = \frac{1}{10}e^{-x}(\cos x + \sin x)$.
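The constants in Case (d) may also be found by a complex-exponential shortcut, which is a different technique from the direct substitution used in the text and is offered here only as a cross-check. Since $e^{\alpha x}\sin mx$ is the imaginary part of $e^{sx}$ with $s = \alpha + im$, and the coefficients of P are real, a particular integral is the imaginary part of $e^{sx}/P(s)$ provided $P(s) \neq 0$. The function name `product_trial` is hypothetical.

```python
def product_trial(p, alpha, m):
    """Return (a, b) with y_p = e^(alpha x) (a cos mx + b sin mx).

    Solves P(D)y = e^(alpha x) sin mx, where p lists the real coefficients
    of P in descending powers and P(alpha + i m) is assumed non-zero."""
    s = complex(alpha, m)
    P = 0
    for c in p:                # Horner evaluation of P(s)
        P = P * s + c
    w = 1 / P                  # y_p is the imaginary part of w e^(s x)
    return w.imag, w.real

print(product_trial([1, -3, 2], -1, 1))   # y'' - 3y' + 2y = e^(-x) sin x
print(product_trial([1, -2, 2], 0, 1))    # y'' - 2y' + 2y = sin x, Case (c)
```

Writing $w = w_r + iw_i$, the imaginary part of $we^{sx}$ is $e^{\alpha x}(w_i\cos mx + w_r\sin mx)$, which is why the routine returns the pair in that order.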


15-2 (b) The operator D


Let us now introduce a new method of solution of constant coefficient
equations, making use of the differentiation operator D. By the operator D
we are to understand the operation of differentiation already used in
Leibnitz's theorem, so that $D \equiv d/dx$, $D^2 \equiv d^2/dx^2$ and, in
general, $D^n \equiv d^n/dx^n$, where for the moment n is a positive integer.


DEFINITION 15-1 (polynomial operator) We define the polynomial
operator

$$P(D) \equiv D^n + a_1 D^{n-1} + \cdots + a_n$$ (15-14)

by

$$P(D)f(x) \equiv (D^n + a_1 D^{n-1} + \cdots + a_n)f(x) = f^{(n)}(x) + a_1 f^{(n-1)}(x) + \cdots + a_n f(x),$$

where f(x) is any suitably differentiable function.


The number $a_n$ in the operator P(D) is, of course, to be understood to
mean $a_n$ times the identity differentiation operator, which we may write as
$D^0$, with the understanding that $D^0 y = y$. It is conventional, as in Eqn
(15-14), to omit this identity differentiation operator since we shall see later
that no confusion arises if we write $a_n$ in place of $a_n D^0$.


DEFINITION 15-2 (sum and product of operators) Given two polynomial
operators P(D) and Q(D), we define the operators P(D) + Q(D) and
P(D)Q(D) by

$$[P(D) + Q(D)]f(x) = P(D)f(x) + Q(D)f(x)$$

and

$$[P(D)Q(D)]f(x) = P(D)[Q(D)f(x)],$$

where f(x) is any suitably differentiable function.


We shall say two operators P(D) and Q(D) are equal, and will write
P(D) = Q(D), if for all suitable f(x), P(D)f(x) = Q(D)f(x). It is important
to recognize that P(D) and Q(D) so defined are operators and not functions,
in the sense that they only give rise to a function when they operate on some
suitably differentiable function.


Example 15-7 In operator D form the differential equation
$y'' - 3y' + 2y = xe^{-x}$ may be written $(D^2 - 3D + 2)y = xe^{-x}$, where
here $P(D) \equiv D^2 - 3D + 2$. By our first definition we could insert
parentheses and equally well write
$P(D) \equiv D^2 - (3D - 2) \equiv (D^2 - 3D) + 2$.


It is an immediate consequence of this notation that if a is a constant, and
f and g are suitably differentiable functions, then

$$D(aD^r)f = aD^{r+1}f$$ (15-15)

and

$$P(D)[f + g] = P(D)f + P(D)g.$$ (15-16)

The first result follows because

$$D[(aD^r)f] = \frac{d}{dx}\left(a\frac{d^r f}{dx^r}\right) = a\frac{d^{r+1} f}{dx^{r+1}} = aD^{r+1}f.$$

Here we have used the fact that a constant multiplier of a function
commutes with the operation of differentiation so that
$(d/dx)(a\phi(x)) = a(d\phi/dx)$. The second result follows because if
$P(D) \equiv \sum_{r=0}^{n} a_r D^{n-r}$ (with $a_0 = 1$), then

$$P(D)(f + g) = \sum a_r (f + g)^{(n-r)}(x) = \sum a_r f^{(n-r)}(x) + \sum a_r g^{(n-r)}(x) = P(D)f + P(D)g.$$


Other important results that may be established in similar fashion are:

$$P(D) + Q(D) = Q(D) + P(D)$$ (Commutative law); (15-17)

$$[P(D)Q(D)]R(D) = P(D)[Q(D)R(D)]$$ (Associative law); (15-18)

$$P(D)[Q(D) + R(D)] = P(D)Q(D) + P(D)R(D)$$ (Distributive law). (15-19)

A particular case of the last result is

$$(D - \lambda)Q(D) = DQ(D) - \lambda Q(D).$$ (15-20)


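THEOREM 15-5 (the factorization theorem) Suppose that the polynomial
$\lambda^n + a_1\lambda^{n-1} + \cdots + a_n$ has factors
$\lambda - \lambda_1, \lambda - \lambda_2, \ldots, \lambda - \lambda_n$. Then
$D^n + a_1 D^{n-1} + \cdots + a_n = (D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)$.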


Proof The proof of this result is by induction. Suppose that 2” + αἀιλη-1 + 
ΠΣ ΣΈ dn = (A = AAP? + bd? + + + bn). Then ay = δι — Ay, 
ag = be = byA, ~ 2 oy Gan-] = bn-1 ssa by_2A1, Qn = —bn-1A1 and Az-1l + 
Byhr? +) + δ,.- = (A -- λο)(λ — 5). .. (A = An). 

Assume the resuJt to be true for polynomial operators of degree n — 1. 
Then 


Dr14+ 6Dr 24-6 +4 a (8 — A2)( D -- A3) be ode (D — An). 


Now (D -- λ:)δη.: + ῥδιρητ. . ..-.« ῥ,..1) 
= ῬΑ. + δι Dt? + + + + + δ...) — ACD! + 6, D2 
δ θά. ἤν οὐ] 
= Dr + (ῥὶ ΞΕ λι)»"-ἰ + (be ee byA;) Dv? 4+... 


+ (bn-1 — bn-241)D — bn-rdy 
= D® + 4,D" 14 aDr2 +:+++ ap. 


But (D — A,)(D""! + 6; D"-2 + - + - + bn) = (Ὁ — A1)(D — 25). . .(Ὁ - 
An), and therefore 

D®'+a,D™!14+-->++a,-1).D+ a, = (D -- Ai)(D = A2) er (D -- λ,). 
This proves the hereditary property, and the result is clearly true for n = 1, 
so by induction it is generally true. 

Since the choice of the factor A — A, with which the above proof was 
started was arbitrary, we have: 


Corollary 15-1 If $p_1, p_2, \ldots, p_n$ is a permutation of the numbers $1, 2, \ldots, n$,
then

$$(D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n) = (D - \lambda_{p_1})(D - \lambda_{p_2})\cdots(D - \lambda_{p_n}).$$

It follows that

$$P(D)Q(D) = Q(D)P(D). \tag{15-21}$$


Example 15-8 In D operator notation the equation $y'' + 5y' + 6y = \cos x$
becomes $(D^2 + 5D + 6)y = \cos x$. Because of the factorization theorem and
its corollary, it follows that the quadratic polynomial operator
$(D^2 + 5D + 6) = (D + 2)(D + 3) = (D + 3)(D + 2)$.

Let us now briefly recapitulate our discussion of the solution of the 
reduced equation, but this time using the operator D. 


Example 15-9 (distinct roots) Consider the general second order equation
which in factorized form may be written

$$(D - \lambda_1)(D - \lambda_2)y = 0,$$

where $\lambda_1 \neq \lambda_2$. Now set $(D - \lambda_2)y = u$ so that the equation becomes
$(D - \lambda_1)u = 0$, or,

$$\frac{du}{dx} - \lambda_1 u = 0.$$

This has the solution $u = C_1e^{\lambda_1 x}$ so that now we must solve
$(D - \lambda_2)y = C_1e^{\lambda_1 x}$, which is simply the familiar first order linear
differential equation

$$\frac{dy}{dx} - \lambda_2 y = C_1e^{\lambda_1 x}$$

with integrating factor $\mu = e^{-\lambda_2 x}$. Hence

$$\frac{d}{dx}\left(ye^{-\lambda_2 x}\right) = C_1e^{(\lambda_1 - \lambda_2)x}$$

so that the general solution is

$$y_c = \left(\frac{C_1}{\lambda_1 - \lambda_2}\right)e^{\lambda_1 x} + C_2e^{\lambda_2 x},$$

where $C_2$ is another arbitrary constant of integration.

The extension to a polynomial operator of degree $n$ is immediate provided
the roots are distinct. As the constants are arbitrary the divisor $(\lambda_1 - \lambda_2)$
may be omitted from the first coefficient of the general solution (that is,
introduce a new constant $C_1' = C_1/(\lambda_1 - \lambda_2)$).


Example 15-10 (repeated roots) This time let us consider a third order
equation but assume that two of the roots are equal, so that in factorized
form it may be written

$$(D - \lambda_1)^2(D - \lambda_2)y = 0.$$

Changing the order of the factors and setting $(D - \lambda_1)^2y = u$, the
equation becomes $(D - \lambda_2)u = 0$, with the solution $u = C_1e^{\lambda_2 x}$. Hence we
must now solve $(D - \lambda_1)^2y = C_1e^{\lambda_2 x}$. So, writing $v = (D - \lambda_1)y$, this
equation simplifies to $(D - \lambda_1)v = C_1e^{\lambda_2 x}$. The integrating factor is
$\mu = e^{-\lambda_1 x}$, and an application of the argument of the previous example
with re-definition of constants where necessary brings us to the solution
$v = C_1'e^{\lambda_2 x} + C_2e^{\lambda_1 x}$. Finally, we must solve
$(D - \lambda_1)y = C_1'e^{\lambda_2 x} + C_2e^{\lambda_1 x}$. This also has an
integrating factor $\mu = e^{-\lambda_1 x}$, so that

$$\frac{d}{dx}\left(ye^{-\lambda_1 x}\right) = C_1'e^{(\lambda_2 - \lambda_1)x} + C_2,$$

and hence

$$y_c = C_1''e^{\lambda_2 x} + (C_2x + C_3)e^{\lambda_1 x}.$$

As before, constant divisors of the form $\lambda_2 - \lambda_1$ have been omitted and the
arbitrary constant re-defined. This method has thus automatically generated
the two linearly independent solutions $e^{\lambda_1 x}$ and $xe^{\lambda_1 x}$ corresponding to the
two repeated factors $(D - \lambda_1)$. An application of the method to a factor with
multiplicity $r$ generates the linearly independent terms discussed in Theorem
15-2.


Example 15-11 (complex conjugate roots) If a polynomial $P(\lambda)$ with real
coefficients is such that $P(\lambda) = 0$ has a complex root $\lambda = \mu + i\nu$, then we
know it must also have a root $\lambda = \mu - i\nu$. Consequently, as in our previous
study, we know that the corresponding term in the complementary function must be

$$e^{\mu x}(C_1\cos\nu x + C_2\sin\nu x).$$

Also, if the roots have multiplicity $m$, then the corresponding term must
be

$$e^{\mu x}[P_{m-1}(x)\cos\nu x + Q_{m-1}(x)\sin\nu x],$$

where $P_{m-1}(x)$ and $Q_{m-1}(x)$ are polynomials in $x$ of degree $m - 1$ having
arbitrary coefficients. These terms must be added to the other terms that arise
from the real roots of $P(\lambda) = 0$ to obtain the complementary function $y_c$.

Consider the equation

$$(D^5 - 5D^4 + 12D^3 - 16D^2 + 12D - 4)y = 0.$$

In factorized form this becomes

$$(D - 1)(D - 1 - i)^2(D - 1 + i)^2y = 0,$$

showing that the real factor $D - 1$ has multiplicity 1 and the complex
conjugate factors in which $\mu = 1$, $\nu = 1$ have multiplicity 2. The
complementary function $y_c$ is thus

$$y_c = e^x[C_1 + (C_2 + C_3x)\cos x + (C_4 + C_5x)\sin x].$$
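The factorization quoted in this example can be checked by multiplying the factors back together. The following is only a minimal sketch: the helper name `poly_mul` is our own and not part of the text, a polynomial operator is represented as a list of coefficients in descending powers of D, and the real quadratic $D^2 - 2D + 2 = (D - 1 - i)(D - 1 + i)$ is used so that no complex arithmetic is needed.

```python
def poly_mul(p, q):
    """Multiply two polynomials in D given as coefficient lists (highest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


linear = [1, -1]         # D - 1
quadratic = [1, -2, 2]   # D^2 - 2D + 2 = (D - 1 - i)(D - 1 + i)

# (D - 1)(D^2 - 2D + 2)^2, coefficients of D^5 down to D^0
product = poly_mul(linear, poly_mul(quadratic, quadratic))
print(product)  # [1, -5, 12, -16, 12, -4]

# Corollary 15-1: changing the order of the factors gives the same operator.
reordered = poly_mul(poly_mul(quadratic, quadratic), linear)
print(reordered == product)
```

The same helper confirms Example 15-8, since `poly_mul([1, 2], [1, 3])` and `poly_mul([1, 3], [1, 2])` both give the coefficients of $D^2 + 5D + 6$.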


Although these applications of the operator D are of interest, they offer 
no real advantage over the first method we discussed for the solution of 
homogeneous equations. In both cases the same process of factorization is 
required, after which the complementary function may be written down by 
inspection. The real advantage of the operator method is in the determination
of particular integrals as we now show.

Writing an inhomogeneous equation in the compact form

$$P(D)y = f(x), \tag{15-22}$$

we are tempted to regard this as an algebraic equation and to write

$$y_p = \frac{1}{P(D)}\,f(x) \tag{15-23}$$

for the particular integral.

Since P(D) is an operator, the expression 1/P(D), if it can be given a 
meaning, must also be an operator such that when applied to f(x) it generates 
the particular integral. Indeed, substitution into Eqn (15-22) shows that an 
appropriate name for 1/P(D) would be the inverse operator. This is so because 
when 1/P(D) is applied to P(D) it obviously generates the identity operator. 
An inverse operator of this kind can in fact be satisfactorily defined if we 
approach the problem with care. For example, in the very simple equation 
Dy = x, we know that y must be some function which when differentiated 
will yield $x$. It is obvious that in this case $y = \tfrac{1}{2}x^2 + C$, so that when we
write, symbolically, $y = (1/D)x$ we must interpret $1/D$ as implying the
determination of an antiderivative, that is to say, as the ordinary operation of
integration. Similarly, $1/D^2$ represents the operation of integration twice
repeated. It is often convenient to use the properties of indices to write the
$n$-fold repeated operation of integration $1/D^n$ in the form $D^{-n}$.

Care must always be taken to indicate on what function the inverse operator
is to act. An expression of the form $fg/D$ is ambiguous, since it could
mean either $D^{-1}(fg)$ or $f(D^{-1}g)$, which are two different functions.

In Theorem 15-4 we saw that particular integrals differ only by terms 
belonging to the complementary function, so that integration constants may 
be omitted when using the operator D for the determination of particular 
integrals. Retaining these constants of integration will generate both the
complementary function and the particular integral.

The fundamental equation that gives meaning to the operator $(D - \lambda)^{-1}$ is
the linear first order inhomogeneous equation

$$\frac{dy}{dx} - \lambda y = f(x),$$

which in operator notation takes the form

$$(D - \lambda)y = f(x). \tag{15-24}$$

We already know from Eqn (14-20) that the particular integral is

$$y_p(x) = e^{\lambda x}\int f(x)e^{-\lambda x}\,dx, \tag{15-25}$$

so that it follows from Eqns (15-24) and (15-25) that

$$y_p(x) = \frac{1}{D - \lambda}\,f(x) = e^{\lambda x}\int f(x)e^{-\lambda x}\,dx. \tag{15-26}$$

The expression on the right-hand side thus gives meaning to the inverse
operator $(D - \lambda)^{-1}$ acting on the function $f(x)$ and will be taken as our
fundamental definition.


DEFINITION 15-3 (inverse operator) We define the effect of the inverse
operator $(D - \lambda)^{-1}$ acting on a function $f(x)$ by the expression

$$(D - \lambda)^{-1}f(x) = e^{\lambda x}\int f(x)e^{-\lambda x}\,dx.$$

Example 15-12 We shall determine the particular integral of

$$(D - 1)(D - 2)y = e^x,$$

which will necessitate two applications of the inverse operator just defined.
First, using the inverse operator $(D - 1)^{-1}$, and identifying $\lambda$ with 1 and
$f(x)$ with $e^x$ in Eqn (15-26), we have

$$(D - 2)y = \frac{1}{D - 1}\,e^x = e^x\int e^x e^{-x}\,dx = xe^x.$$

Then, using the inverse operator $(D - 2)^{-1}$, identical reasoning gives

$$y_p = \frac{1}{D - 2}\,xe^x = e^{2x}\int xe^x e^{-2x}\,dx = -(x + 1)e^x.$$

Hence the desired particular integral is $y_p = -(x + 1)e^x$. Notice that the
fact that the inhomogeneous term $e^x$ also occurs in the complementary
function $y_c = C_1e^x + C_2e^{2x}$ has been automatically accounted for by this
method, and so no special case need now be distinguished in this respect.
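The two applications of the inverse operator in Example 15-12 can be imitated numerically. The sketch below is illustrative only: the names `simpson` and `inverse_op` are our own, the indefinite integral in Eqn (15-26) is replaced by a definite integral from 0 to x evaluated by Simpson's rule, and this choice of lower limit is an assumption that fixes the constants of integration.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson approximation to the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def inverse_op(lam, f):
    """Return x -> exp(lam x) * integral_0^x f(t) exp(-lam t) dt, cf. Eqn (15-26)."""
    def y(x):
        return math.exp(lam * x) * simpson(
            lambda t: f(t) * math.exp(-lam * t), 0.0, x)
    return y


u = inverse_op(1.0, math.exp)   # (D - 1)^{-1} e^x, expected to equal x e^x
yp = inverse_op(2.0, u)         # (D - 2)^{-1} applied to the result

x = 1.0
print(u(x), x * math.exp(x))
print(yp(x), math.exp(2 * x) - (x + 1) * math.exp(x))
```

With the lower limit 0 the second result is $e^{2x} - (x + 1)e^x$ rather than $-(x + 1)e^x$. The two differ only by the term $e^{2x}$, which belongs to the complementary function, in line with the remark that retained constants of integration generate complementary function terms.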


Example 15-13 The application of the operator defined in Eqn (15-26) to
complex factors is equally straightforward. Thus if

$$(D^2 - 2D + 2)y = e^x,$$

then after factorization we have

$$(D - 1 - i)(D - 1 + i)y = e^x.$$

Proceeding as before then gives

$$(D - 1 + i)y = \frac{1}{D - 1 - i}\,e^x = e^{(1+i)x}\int e^x e^{-(1+i)x}\,dx = ie^x.$$

Hence,

$$y_p = \frac{1}{D - 1 + i}\,ie^x = e^{(1-i)x}\int ie^x e^{-(1-i)x}\,dx = e^x,$$

showing that the required particular integral is $y_p = e^x$.


SOME SIMPLE RULES In special cases the inverse operator just defined
simplifies to give some easy rules for the determination of $y_p$. Suppose first
that $P(D)y = ke^{\alpha x}$, where $P(D) = D^n + a_1D^{n-1} + \cdots + a_n$, and where $\alpha$
is not a root of the characteristic polynomial
$P(\lambda) = \lambda^n + a_1\lambda^{n-1} + \cdots + a_n = 0$. By direct differentiation it is
easily shown that

$$(D^n + a_1D^{n-1} + \cdots + a_n)e^{\alpha x} = e^{\alpha x}(\alpha^n + a_1\alpha^{n-1} + \cdots + a_n),$$

which is a result that may be expressed by the relation

$$P(D)e^{\alpha x} = e^{\alpha x}P(\alpha), \tag{15-27}$$

where $P(\alpha)$ is just a number obtained from $P(D)$ by formally replacing $D$
by $\alpha$. As $\alpha$ is not a root of $P(\lambda) = 0$ we have $P(\alpha) \neq 0$, and since
$[P(D)]^{-1}P(D)e^{\alpha x} = e^{\alpha x}$, comparison of this identity with Eqn (15-27) gives:


Rule 1
If $P(\alpha) \neq 0$, then

$$\frac{1}{P(D)}\,e^{\alpha x} = \frac{e^{\alpha x}}{P(\alpha)}.$$

Applying Rule 1 to Example 15-13, in which $P(D) = D^2 - 2D + 2$,
shows that we are required to solve

$$y_p = \frac{1}{P(D)}\,e^x = \frac{1}{D^2 - 2D + 2}\,e^x.$$

Here $\alpha = 1$, so that $P(\alpha) = 1 - 2 + 2 = 1$, whence $y_p = e^x$. The rule is
inapplicable in Example 15-12 because in that case $P(\alpha) = 0$, and so operator
Eqn (15-26) must be used instead.

Another simple case occurs on account of the obvious results
$D^{2n}\cos mx = (-1)^n m^{2n}\cos mx$ and $D^{2n}\sin mx = (-1)^n m^{2n}\sin mx$. If the operator
only contains even powers of $D$, which we denote by writing $P(D^2)$, then,
arguing as in Rule 1, it is easy to establish:


Rule 2
If $P(D)$ only contains even powers of $D$ and so may be written in the form
$P(D^2)$ then, providing $P(-m^2) \neq 0$,

$$\frac{1}{P(D^2)}\,\sin mx = \frac{\sin mx}{P(-m^2)} \quad\text{and}\quad \frac{1}{P(D^2)}\,\cos mx = \frac{\cos mx}{P(-m^2)}.$$

If, for example, $(D^4 - 3D^2 + 2)y = \cos 2x$, then since $m = 2$ and
$P(D^2) = D^4 - 3D^2 + 2$, it follows that $P(-2^2) = (-4)^2 - 3(-4) + 2 = 30$.
Hence by Rule 2 we find that $y_p = (1/30)\cos 2x$.
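Rules 1 and 2 both reduce to evaluating a polynomial at a single number, so they are easy to mechanize. The sketch below is a minimal illustration under our own naming (`horner`, `rule1` and `rule2_cos` are hypothetical helpers, not from the text); coefficients are listed in descending powers, and for Rule 2 the polynomial is written in the variable $s = D^2$.

```python
import math


def horner(coeffs, s):
    """Evaluate a polynomial with coefficients in descending powers at s."""
    value = 0
    for c in coeffs:
        value = value * s + c
    return value


def rule1(coeffs, alpha):
    """Particular integral of P(D) y = exp(alpha x); requires P(alpha) != 0."""
    p_alpha = horner(coeffs, alpha)
    if p_alpha == 0:
        raise ValueError("alpha is a root of P, so Rule 1 does not apply")
    return lambda x: math.exp(alpha * x) / p_alpha


def rule2_cos(coeffs_in_d2, m):
    """Particular integral of P(D^2) y = cos(m x); requires P(-m^2) != 0."""
    p_value = horner(coeffs_in_d2, -m * m)
    if p_value == 0:
        raise ValueError("P(-m^2) = 0, so Rule 2 does not apply")
    return lambda x: math.cos(m * x) / p_value


print(horner([1, -2, 2], 1))    # P(1) for D^2 - 2D + 2 (Example 15-13)
print(horner([1, -3, 2], -4))   # P(-4) for D^4 - 3D^2 + 2 written in s = D^2
```

Calling `rule1([1, -3, 2], 1)` raises the error, which mirrors the remark that Rule 1 is inapplicable to Example 15-12, where $P(D) = (D - 1)(D - 2) = D^2 - 3D + 2$ and $P(1) = 0$.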

Another rule may be deduced by applying the formula in Eqn (15-26) to the
function $x^s$ and then comparing the result with the effect of the operator

$$-\frac{1}{\lambda}\left(1 + \frac{D}{\lambda} + \frac{D^2}{\lambda^2} + \cdots + \frac{D^s}{\lambda^s}\right) \tag{15-28}$$

applied to $x^s$. This operator, which arises when $(D - \lambda)^{-1} = -(1/\lambda)(1 - D/\lambda)^{-1}$
is formally expanded by the Binomial Theorem, is seen to give the same result when applied
to $x^s$ as an application of Eqn (15-26) and, because $D^{s+1}x^s = 0$, the expansion
may be terminated after the term $D^s$. Applying this operator to a polynomial
of degree $m$ establishes:


Rule 3

$$\frac{1}{D - \lambda}\,(b_0 + b_1x + \cdots + b_mx^m) = -\frac{1}{\lambda}\left(1 + \frac{D}{\lambda} + \frac{D^2}{\lambda^2} + \cdots + \frac{D^m}{\lambda^m}\right)(b_0 + b_1x + \cdots + b_mx^m).$$

To illustrate this, let us suppose that $(D - 4)y = 1 + x^2$, then
$y_p = (D - 4)^{-1}(1 + x^2)$. By Rule 3 we see that we need to set $m = 2$ and $\lambda = 4$,
so that $y_p = (D - 4)^{-1}(1 + x^2) = -(1/4)[1 + (D/4) + (D^2/4^2)](1 + x^2)$.
Performing the indicated differentiations we find that
$y_p = -(9 + 4x + 8x^2)/32$.
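Rule 3 acts on a polynomial purely through repeated differentiation, so it can be carried out exactly on a list of coefficients. The following is a sketch only, with our own hypothetical helper names (`derivative`, `rule3`); polynomials are stored in ascending powers of x and exact rational arithmetic is used so that the result can be compared directly with the worked example.

```python
from fractions import Fraction


def derivative(coeffs):
    """Differentiate a polynomial stored as coefficients in ascending powers."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def rule3(lam, coeffs):
    """Apply -(1/lam)(1 + D/lam + D^2/lam^2 + ...) to a polynomial (lam != 0)."""
    lam = Fraction(lam)
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]   # D^k applied to the polynomial
    scale = -1 / lam                       # -(1/lam) * (1/lam)^k
    while term:                            # the series stops once D^k p = 0
        for i, c in enumerate(term):
            result[i] += scale * c
        term = derivative(term)
        scale /= lam
    return result


# (D - 4) y = 1 + x^2, i.e. lam = 4 and polynomial 1 + 0 x + 1 x^2
yp = rule3(4, [1, 0, 1])
print(yp)   # coefficients of 1, x, x^2 in the particular integral
```

For this input the coefficients are $-9/32$, $-1/8$ and $-1/4$, which is $-(9 + 4x + 8x^2)/32$ as found above.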


The particular integral corresponding to the more general expression

$$P(D)y = b_0 + b_1x + \cdots + b_mx^m$$

may be deduced by factorizing the polynomial operator $P(D)$ and making
repeated use of Rule 3.
Sometimes it is useful to reformulate Rule 3 so that, if desired, it may be
applied directly to operators $P(D)$ without resolving them into simple first
order factors of the form $(D - \lambda)$. This is certainly necessary if $P(D)$ has any
quadratic factors corresponding to pairs of complex conjugate roots of the
characteristic polynomial. The desired modification of Rule 3 is easily
achieved by using a repetition of the previous arguments to arrive at:


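Modified Rule 3
If $P(D) = D^n + a_1D^{n-1} + \cdots + a_{n-1}D + a_n$ with $a_n \neq 0$, and $P(D)$ is
expressed in the form

$$P(D) = a_n[1 - Q(D)],$$

so that $Q(D) = -(1/a_n)(D^n + a_1D^{n-1} + \cdots + a_{n-1}D)$, then

$$\frac{1}{P(D)}\,(b_0 + b_1x + \cdots + b_mx^m) = \frac{1}{a_n}\left(1 + Q(D) + Q^2(D) + \cdots + Q^m(D)\right)(b_0 + b_1x + \cdots + b_mx^m).$$

By way of example suppose $(D^2 - 3D + 2)y = x^2 + 1$. Then
$P(D) = D^2 - 3D + 2$, and in the above notation $a_1 = -3$, $a_2 = 2$, showing that
$Q(D) = -\tfrac{1}{2}D^2 + \tfrac{3}{2}D$. Since the polynomial is of degree 2 we have $m = 2$
and the Modified Rule 3 then gives

$$\frac{1}{P(D)}\,(x^2 + 1) = \tfrac{1}{2}\left[1 + \left(-\tfrac{1}{2}D^2 + \tfrac{3}{2}D\right) + \left(-\tfrac{1}{2}D^2 + \tfrac{3}{2}D\right)^2\right](x^2 + 1).$$

Performing the indicated differentiations we finally arrive at the result

$$y_p = \tfrac{1}{2}x^2 + \tfrac{3}{2}x + \tfrac{9}{4}.$$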
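The Modified Rule 3 can also be carried out exactly on coefficient lists. The sketch below is illustrative and rests on our own conventions: `apply_operator` and `modified_rule3` are hypothetical names, both the operator and the polynomial are stored in ascending powers (of D and of x respectively), and the constant term $a_n$ of $P(D)$ is assumed to be non-zero.

```python
from fractions import Fraction


def apply_operator(op, poly):
    """Apply sum_k op[k] D^k to a polynomial given in ascending powers of x."""
    result = [Fraction(0)] * len(poly)
    term = [Fraction(c) for c in poly]      # holds D^k poly as k increases
    for a in op:
        for i, c in enumerate(term):
            result[i] += a * c
        term = [k * c for k, c in enumerate(term)][1:]
    return result


def modified_rule3(p, poly):
    """Particular integral of P(D) y = poly, with p = [a_n, ..., a_1, 1] ascending in D."""
    a_n = Fraction(p[0])
    q = [Fraction(0)] + [-Fraction(c) / a_n for c in p[1:]]   # Q(D) = 1 - P(D)/a_n
    total = [Fraction(0)] * len(poly)
    term = [Fraction(c) for c in poly]
    for _ in range(len(poly)):              # powers Q^0, Q^1, ..., Q^m
        total = [t + c for t, c in zip(total, term)]
        term = apply_operator(q, term)
    return [t / a_n for t in total]


# (D^2 - 3D + 2) y = x^2 + 1
yp = modified_rule3([2, -3, 1], [1, 0, 1])
print(yp)   # coefficients of 1, x, x^2
```

For this input the coefficients are $9/4$, $3/2$ and $1/2$, in agreement with the result obtained by hand.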


The final rule concerns the operator inverse to $P(D)[e^{\mu x}u(x)]$, where $u(x)$
is any suitably differentiable function. By direct differentiation we have
$D^r(e^{\mu x}u) = D^{r-1}D(e^{\mu x}u) = D^{r-1}(\mu e^{\mu x}u + e^{\mu x}Du) = D^{r-1}e^{\mu x}(\mu + D)u$.
Similarly, $D^{r-1}e^{\mu x}(\mu + D)u = D^{r-2}D[e^{\mu x}(\mu + D)u] = D^{r-2}e^{\mu x}(\mu + D)^2u = \cdots
= e^{\mu x}(\mu + D)^ru$. Hence, applying the argument to a polynomial $P(D)$ gives

$$P(D)[e^{\mu x}u(x)] = e^{\mu x}[P(D + \mu)u(x)]. \tag{15-29}$$

To derive the inverse operator rule we now set $P(D + \mu)u(x) = v(x)$,
which may be any differentiable function, since $u(x)$ is otherwise unspecified,
so that $u(x) = P^{-1}(D + \mu)v(x)$. Then, using these results in Eqn (15-29)
gives:

Rule 4
If $v(x)$ is a suitably differentiable function, then

$$\frac{1}{P(D)}\,(e^{\mu x}v) = e^{\mu x}\,\frac{1}{P(D + \mu)}\,v.$$
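The shifted operator $P(D + \mu)$ that appears in Rule 4 can be obtained mechanically from the coefficients of $P(D)$. The sketch below is only an illustration; `shift_operator` is our own hypothetical helper, and it uses Horner's scheme, multiplying the running result by $(D + \mu)$ and adding the next coefficient at each stage.

```python
def shift_operator(coeffs, mu):
    """Return the coefficients of P(D + mu), given those of P(D) in descending powers."""
    result = [coeffs[0]]
    for c in coeffs[1:]:
        # Multiply the running polynomial by (D + mu) ...
        shifted = result + [0]
        for i in range(len(result)):
            shifted[i + 1] += mu * result[i]
        # ... and add the next coefficient of P.
        shifted[-1] += c
        result = shifted
    return result


# P(D) = D^2 + 5D + 6 with mu = -1, as needed for a right-hand side x e^{-x}
print(shift_operator([1, 5, 6], -1))   # [1, 3, 2], i.e. D^2 + 3D + 2
```

With $\mu = 0$ the function returns the original coefficients unchanged, which is a convenient sanity check.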


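We illustrate this rule by considering the determination of the particular
integral of the equation $(D^2 + 5D + 6)y = xe^{-x}$. Here $P(D) = D^2 + 5D + 6$
and $\mu = -1$, so that $P(D + \mu) = (D - 1)^2 + 5(D - 1) + 6 = D^2 + 3D + 2$.
Then by Rule 4, $y_p = e^{-x}(D^2 + 3D + 2)^{-1}x$. Factorizing this
result and using Eqn (15-28) then gives

$$y_p = e^{-x}\,\frac{1}{(D + 1)(D + 2)}\,x = \frac{e^{-x}}{2}\left[(1 - D + D^2 - \cdots)\left(1 - \frac{D}{2} + \frac{D^2}{4} - \cdots\right)\right]x.$$

Expanding the square bracket as far as terms involving $D$, because the
operator is only acting on $x$, we find that

$$y_p = \frac{e^{-x}}{2}\left(1 - \frac{3D}{2}\right)x = \left(\frac{x}{2} - \frac{3}{4}\right)e^{-x}.$$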

15-3 Variation of parameters


An alternative approach to the determination of the complete primitive when 
the complementary function is known is provided by the method of variation 
of parameters which we now describe for a second order equation. 

Consider the linear constant coefficient inhomogeneous second order
equation

$$y'' + ay' + by = f(x). \tag{15-30}$$

Then if $y_1$ and $y_2$ are the two linearly independent solutions, the
complementary function $y_c(x) = C_1y_1(x) + C_2y_2(x)$, where $C_1$ and $C_2$ are constants.
We shall now try to find functions $C_1(x)$ and $C_2(x)$ such that the solution to
the inhomogeneous Eqn (15-30) is

$$y(x) = C_1(x)y_1(x) + C_2(x)y_2(x). \tag{15-31}$$

If this result is true, then denoting differentiation with respect to $x$ by a prime,

$$y'(x) = C_1(x)y_1'(x) + C_2(x)y_2'(x), \tag{15-32}$$

provided we require that

$$C_1'(x)y_1(x) + C_2'(x)y_2(x) = 0. \tag{15-33}$$

Then, by differentiation of Eqn (15-32), we also have


676 / HIGHER ORDER DIFFERENTIAL EQUATIONS CH 15 


$$y''(x) = C_1(x)y_1''(x) + C_2(x)y_2''(x) + C_1'(x)y_1'(x) + C_2'(x)y_2'(x). \tag{15-34}$$

Substituting Eqns (15-31), (15-32), and (15-34) in Eqn (15-30) and grouping
terms gives

$$C_1(x)[y_1''(x) + ay_1'(x) + by_1(x)] + C_2(x)[y_2''(x) + ay_2'(x) + by_2(x)] + C_1'(x)y_1'(x) + C_2'(x)y_2'(x) = f(x).$$

As $y_1(x)$ and $y_2(x)$ are solutions of the homogeneous equation (the reduced
equation), the coefficients of $C_1(x)$ and $C_2(x)$ vanish identically leaving

$$C_1'(x)y_1'(x) + C_2'(x)y_2'(x) = f(x). \tag{15-35}$$

Equations (15-33) and (15-35) may now be solved for $C_1'(x)$ and $C_2'(x)$
in terms of the known functions $y_1(x)$, $y_2(x)$, and $f(x)$, so that the desired
functions $C_1(x)$ and $C_2(x)$ can then be determined by straightforward
integration. We obtain

$$C_1(x) = C_1 - \int \frac{y_2f}{|W|}\,dx, \qquad C_2(x) = C_2 + \int \frac{y_1f}{|W|}\,dx, \tag{15-36}$$

where $|W| = y_1y_2' - y_1'y_2$ and $C_1$ and $C_2$ are constants. The divisor $|W|$
will be recognized as the Wronskian of the functions $y_1(x)$ and $y_2(x)$.
The complete primitive is

$$y(x) = \left\{C_1 - \int \frac{y_2f}{|W|}\,dx\right\}y_1(x) + \left\{C_2 + \int \frac{y_1f}{|W|}\,dx\right\}y_2(x). \tag{15-37}$$

We illustrate this method by applying it to the example used in connection
with Rule 4. The equation $y'' + 5y' + 6y = xe^{-x}$ has the complementary
function $y_c = C_1e^{-3x} + C_2e^{-2x}$, showing that $y_1 = e^{-3x}$ and $y_2 = e^{-2x}$.
Here $|W| = e^{-5x}$, so that Eqn (15-37) takes the form

$$y = \left\{C_1 - \int xe^{2x}\,dx\right\}e^{-3x} + \left\{C_2 + \int xe^{x}\,dx\right\}e^{-2x}.$$

Evaluating the integrals and combining terms where possible shows that
the general solution is

$$y = C_1e^{-3x} + C_2e^{-2x} + \tfrac{1}{2}xe^{-x} - \tfrac{3}{4}e^{-x}.$$
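Formula (15-37) lends itself to a direct numerical check. The sketch below is illustrative only: the helper names are our own, the integrals are taken from 0 to x by Simpson's rule (an assumption that fixes the otherwise arbitrary constants), and the arbitrary constants $C_1$ and $C_2$ are set to zero.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson approximation to the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def variation_of_parameters(y1, dy1, y2, dy2, f):
    """Return a particular solution built from Eqn (15-37) with C1 = C2 = 0."""
    def wronskian(x):
        return y1(x) * dy2(x) - dy1(x) * y2(x)

    def yp(x):
        c1 = -simpson(lambda t: y2(t) * f(t) / wronskian(t), 0.0, x)
        c2 = simpson(lambda t: y1(t) * f(t) / wronskian(t), 0.0, x)
        return c1 * y1(x) + c2 * y2(x)

    return yp


# y'' + 5y' + 6y = x e^{-x}, with y1 = e^{-3x} and y2 = e^{-2x}
yp = variation_of_parameters(
    lambda x: math.exp(-3 * x), lambda x: -3 * math.exp(-3 * x),
    lambda x: math.exp(-2 * x), lambda x: -2 * math.exp(-2 * x),
    lambda x: x * math.exp(-x),
)

x = 1.0
closed_form = ((x / 2 - 0.75) * math.exp(-x)
               - 0.25 * math.exp(-3 * x) + math.exp(-2 * x))
print(yp(x), closed_form)
```

The closed form used for comparison is $(\tfrac{1}{2}x - \tfrac{3}{4})e^{-x}$ plus the complementary function terms $-\tfrac{1}{4}e^{-3x} + e^{-2x}$ produced by the lower limit 0.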


The method of variation of parameters may be extended to linear equations
of any order provided suitable extra supplementary conditions of the
form Eqn (15-33) are introduced as each successive derivative is computed.
Thus for an $n$th order equation involving the functions $C_1(x), C_2(x), \ldots,
C_n(x)$ and the independent solutions $y_1(x), y_2(x), \ldots, y_n(x)$, we would need
to introduce the $(n - 1)$ supplementary conditions

$$C_1'y_1^{(r)} + C_2'y_2^{(r)} + \cdots + C_n'y_n^{(r)} = 0, \quad\text{with } r = 0, 1, \ldots, n - 2.$$


SEC 15:2 LINEAR EQUATIONS—INHOMOGENEOUS CASE / 677 


15-4 Simultaneous linear differential equations


Many important physical processes are described by systems of simultaneous 
linear constant coefficient differential equations. Typical examples are the 
equations describing interacting control systems, interacting electric circuits, 
and reversible chemical reactions. The following two first order equations
involving the dependent variables $x$ and $y$ together with the independent
variable $t$ provide a simple illustration of this type of problem and its
solution.

Let us determine $x$ and $y$ as functions of $t$ if initially at $t = 0$ we have that
$x = -\tfrac{1}{4}$, $y = 1$ and, subsequently,

$$\dot{x} = x + y + t$$

$$\dot{y} = 3x - y,$$

where the dot denotes differentiation with respect to $t$.

From the first equation we see that $y = \dot{x} - x - t$, so that eliminating $y$
from the second equation then gives $\dot{y} = 4x + t - \dot{x}$. If now we differentiate
the first equation with respect to $t$ we obtain $\ddot{x} = \dot{x} + \dot{y} + 1$, which can be
combined with the previous result to give $\ddot{x} - 4x = t + 1$. This is a second
order inhomogeneous equation involving only the dependent variable $x$.
Using any of the methods previously described it is easily established that
its general solution is

$$x = A\cosh 2t + B\sinh 2t - \tfrac{1}{4}(1 + t).$$

To determine $y$ we now substitute this expression for $x$ in the first of the
simultaneous equations to obtain

$$y = A(2\sinh 2t - \cosh 2t) + B(2\cosh 2t - \sinh 2t) - \tfrac{3}{4}t.$$

Inserting the initial conditions for $x$ and $y$ into these expressions shows
that $A = 0$, $B = \tfrac{1}{2}$ and so the required particular solution is

$$x = \tfrac{1}{2}\sinh 2t - \tfrac{1}{4}(1 + t),$$

$$y = \tfrac{1}{2}(2\cosh 2t - \sinh 2t) - \tfrac{3}{4}t.$$


Notice that had we inserted the general solution for x in the second of the 
simultaneous equations we should have obtained a first order inhomogeneous 
equation for y which would have apparently necessitated the introduction 
of a third arbitrary constant into the solution. This anomaly is easily resolved 
by recalling that the value of y so determined, when taken with x, must be 
compatible with the first equation. Hence the apparently additional arbitrary 
constant required by this approach is in fact dependent on A and B. This 
complication is completely avoided by the method adopted here. 
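A solution of this kind is easily checked by substituting it back into both equations. The sketch below does this numerically for the particular solution with $A = 0$, $B = \tfrac{1}{2}$; the helper names are our own, and the derivatives are approximated by central differences, so the residuals are expected to be small rather than exactly zero.

```python
import math


def x(t):
    """Particular solution for x with A = 0, B = 1/2."""
    return 0.5 * math.sinh(2 * t) - 0.25 * (1 + t)


def y(t):
    """Corresponding solution for y."""
    return 0.5 * (2 * math.cosh(2 * t) - math.sinh(2 * t)) - 0.75 * t


def ddt(g, t, h=1e-5):
    """Central difference approximation to dg/dt."""
    return (g(t + h) - g(t - h)) / (2 * h)


def residuals(t):
    """Residuals of x' = x + y + t and y' = 3x - y at time t."""
    r1 = ddt(x, t) - (x(t) + y(t) + t)
    r2 = ddt(y, t) - (3 * x(t) - y(t))
    return r1, r2


print(x(0.0), y(0.0))          # initial values -0.25 and 1.0
for t in (0.0, 0.5, 1.0):
    print(t, residuals(t))
```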




15-5 Series solution of differential equations 


We have seen that a differential equation is in fact one of the many ways in
which functions may be defined. Thus, for example, the simple equation
$y'' + y = 0$ defines the two linearly independent functions $\cos x$ and $\sin x$
which appear in its general solution. As functions such as these may be
represented by power series it is reasonable to attempt to derive power series
solutions directly from the differential equation.

We now describe two simple methods that may be used; the connection 
between them will become obvious as we proceed. The first method is the 
more general and makes direct appeal to the Taylor series expansion of a 
function $y(x)$ about the point $x = x_0$:

$$y(x) = \sum_{n=0}^{\infty} \frac{(x - x_0)^n}{n!}\,y^{(n)}(x_0). \tag{15-38}$$

The second method determines the coefficients $c_r$ in the general power series
expansion

$$y(x) = \sum_{r=0}^{\infty} c_r(x - x_0)^r. \tag{15-39}$$

Method 1 


Let us use the Taylor series expansion to determine the series solution of
$y'' + y = x$ subject to the initial conditions $y = 1$, $y' = 0$ when $x = 0$.
Successive differentiation of the differential equation with respect to $x$ yields
$y^{(2)} = x - y$, $y^{(3)} = 1 - y^{(1)}$, $y^{(4)} = -y^{(2)}$, $\ldots$, $y^{(n)} = -y^{(n-2)}$ $(n \geq 4)$.
Combining these results shows that derivatives of order 2 and above are all
determined in terms of $y$, $y^{(1)}$, and $x$ by the equations $y^{(2)} = -(y - x)$,
$y^{(3)} = -(y^{(1)} - 1)$, $y^{(4)} = (y - x)$, $y^{(5)} = y^{(1)} - 1$, $y^{(6)} = -(y - x)$, $\ldots$.
Since the initial values of $y$ and $y' = y^{(1)}$ are known at $x = 0$, we can thus
determine $y^{(n)}(0)$ for $n > 1$ and thereafter substitute their values into Eqn
(15-38). Doing this we obtain $y_0 = 1$, $y_0^{(1)} = 0$, $y_0^{(2)} = -1$, $y_0^{(3)} = 1$,
$y_0^{(4)} = 1$, $y_0^{(5)} = -1$, $y_0^{(6)} = -1$, $\ldots$, so that setting $x_0 = 0$ in Eqn (15-38)
gives the Taylor series expansion

$$y(x) = 1 + x\cdot(0) + \frac{x^2}{2!}(-1) + \frac{x^3}{3!}(1) + \frac{x^4}{4!}(1) + \frac{x^5}{5!}(-1) + \frac{x^6}{6!}(-1) + \cdots.$$

Comparison of this result with the series for $e^x$ shows that $y(x)$ is absolutely
convergent for all $x$, and hence it is also convergent for all $x$.

In this case the result may be expressed in more familiar form by using the
fact that terms in an absolutely convergent series may be rearranged.
Grouping the terms as shown below and adding and subtracting $x$ to the
right-hand side gives

$$y = x + \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots\right) - \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right),$$

which we immediately recognize as the series expansion of $y = x + \cos x - \sin x$.
Although in this case the solution could have been obtained directly by
using our previous methods, it is still possible to solve differential equations
by this method when the solution cannot be expressed in terms of elementary
functions. For example, the non-linear differential equation $y' = x^2 - y^2$
with $y = 1$ at $x = 0$ could equally well have been solved using this approach.

Had the general solution of $y'' + y = x$ been sought in terms of a power
series, without initial conditions being stated, the same argument would have
produced

$$y = x + (y_0 - x_0)\left[1 - \frac{(x - x_0)^2}{2!} + \frac{(x - x_0)^4}{4!} - \cdots\right] + (y_0^{(1)} - 1)\left[(x - x_0) - \frac{(x - x_0)^3}{3!} + \frac{(x - x_0)^5}{5!} - \cdots\right],$$

where $y_0$ and $y_0^{(1)}$ are the functional value and its derivative at an arbitrary
value $x = x_0$. Here the constant multipliers $(y_0 - x_0)$ and $(y_0^{(1)} - 1)$ take
the place of the two arbitrary constants $A$ and $B$ occurring in the more usual
form of the general solution $y = x + A\cos(x - x_0) + B\sin(x - x_0)$.
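Method 1 can be followed step by step on a computer for the example $y'' + y = x$ with $y(0) = 1$, $y'(0) = 0$. The sketch below is a minimal illustration (the function name `taylor_solution` is our own): it builds the derivatives at $x = 0$ from the recurrence $y^{(n)} = -y^{(n-2)}$ for $n \geq 4$, sums the Taylor series, and compares the result with $x + \cos x - \sin x$.

```python
import math


def taylor_solution(x, terms=30):
    """Sum the Taylor series about 0 for y'' + y = x with y(0) = 1, y'(0) = 0."""
    # y(0), y'(0), y''(0) = 0 - y(0), y'''(0) = 1 - y'(0)
    derivs = [1.0, 0.0, -1.0, 1.0]
    while len(derivs) < terms:
        derivs.append(-derivs[-2])          # y^(n) = -y^(n-2) for n >= 4
    return sum(d * x ** n / math.factorial(n) for n, d in enumerate(derivs))


for x in (0.5, 1.0, 2.0):
    exact = x + math.cos(x) - math.sin(x)
    print(x, taylor_solution(x), exact)
```

Thirty terms are far more than needed for these arguments; the number of terms is a free choice here, not something prescribed by the method.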


Method 2 


An alternative approach that may be used is to substitute the power series
Eqn (15-39) into the differential equation and to equate coefficients of
corresponding powers of $x$. This is an essentially simpler approach than the
previous one but is not usually successful if the equation is non-linear in $y$.
We illustrate it by obtaining the power series solution of $y'' + xy = 0$
subject to the initial conditions $y = 1$, $y' = 0$ at $x = 0$. Since the expansion
is required about $x_0 = 0$ we set $y = c_0 + c_1x + c_2x^2 + \cdots$, from which it
follows that $y' = c_1 + 2c_2x + 3c_3x^2 + \cdots$ and $y'' = 2c_2 + 6c_3x + 12c_4x^2 + \cdots$.
Substitution into the differential equation gives

$$(2c_2 + 6c_3x + 12c_4x^2 + 20c_5x^3 + \cdots) + x(c_0 + c_1x + c_2x^2 + \cdots) = 0.$$

Equating coefficients of corresponding powers of $x$ shows that $2c_2 = 0$,
$6c_3 + c_0 = 0$, $12c_4 + c_1 = 0$, $20c_5 + c_2 = 0$, $\ldots$. Using the initial
conditions in the expressions for $y$ and $y'$ then shows that $c_0 = 1$ and $c_1 = 0$,
which when combined with the above results gives $c_2 = 0$, $c_3 = -\tfrac{1}{6}$, $c_4 = 0$,
$c_5 = 0$, $\ldots$. The power series expansion corresponding to Eqn (15-39) is
thus

$$y = 1 - \frac{1}{3!}x^3 + \frac{1\cdot 4}{6!}x^6 - \frac{1\cdot 4\cdot 7}{9!}x^9 + \cdots$$

with the $(n + 1)$th term having the form

$$(-1)^n\,\frac{1\cdot 4\cdot 7\cdots(3n - 2)}{(3n)!}\,x^{3n} \qquad (n \geq 1).$$

Using this general term and applying the ratio test to the series shows
immediately that it is in fact convergent for $-\infty < x < \infty$.

Had initial conditions been specified at the point $x = x_0$ with $x_0 \neq 0$,
then in place of the power series in $x$ that has just been used it would have been
necessary to write

$$y = c_0 + c_1(x - x_0) + c_2(x - x_0)^2 + \cdots,$$

and thereafter to proceed as before.
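Equating coefficients in Method 2 amounts to a recurrence relation, which for $y'' + xy = 0$ reads $n(n - 1)c_n + c_{n-3} = 0$ for $n \geq 3$, together with $c_2 = 0$. The sketch below generates the coefficients exactly; the function name `series_coefficients` is our own and the recurrence is specific to this one equation.

```python
from fractions import Fraction


def series_coefficients(c0, c1, count):
    """Coefficients c_0 .. c_{count-1} of the power series solution of y'' + x y = 0."""
    c = [Fraction(c0), Fraction(c1), Fraction(0)]    # 2 c_2 = 0
    for n in range(3, count):
        c.append(-c[n - 3] / (n * (n - 1)))          # n(n-1) c_n + c_{n-3} = 0
    return c[:count]


coeffs = series_coefficients(1, 0, 10)
for n, c in enumerate(coeffs):
    print(n, c)
```

With $c_0 = 1$ and $c_1 = 0$ only every third coefficient is non-zero: $c_3 = -1/6$, $c_6 = 1/180 = (1\cdot 4)/6!$ and $c_9 = -1/12960 = -(1\cdot 4\cdot 7)/9!$.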

Both of these methods will obviously fail if the function y(x) cannot be 
expanded in a power series about the point at which the initial values are 
specified. Since a function which is capable of expansion in a power series 
about a point is said to be analytic at that point, the above two methods will 
both fail should the function described by the differential equation be non- 
analytic at the point at which the initial values are specified. A modified 
method due to Frobenius must then be used, though we shall not discuss it in 
detail here. In essence it is similar to Method 2 above, though a more general 
solution is sought in the form

$$y(x) = \sum_{r=0}^{\infty} c_r x^{m+r},$$

where the coefficients $c_r$ and the constant $m$ require determination. Full
accounts of the method are to be found in almost any more advanced text on
differential equations.


15-6 Runge-Kutta method


The series methods for the solution of differential equations described in 
the previous section provide analytical techniques for the determination of the 
numerical behaviour of a particular solution. If the rate of convergence of the 
series is poor, or if a non-linearity in the equation precludes the use of such 
methods, some other approach must be used to obtain an accurate numerical 
solution. 

The very useful and flexible numerical method that we now describe was 
first introduced by C. Runge at the turn of the century and subsequently 
modified and improved by W. Kutta. It is essentially a generalization of 
Simpson’s rule and it can be shown that the error involved when integrating
a step of length $\Delta x$ is of the order of $(\Delta x)^5$. The method is simple to use and,
unlike the predictor-corrector method outlined in Chapter 13, allows 
adjustment of the length of the integration step from point to point without 
modification of the method. 

We suppose that $x$ and $y$ assume the values $x_n$, $y_n$ after the $n$th integration
step in the numerical integration of

$$\frac{dy}{dx} = f(x, y). \tag{15-40}$$

Then the value $y_{n+1}$ of the dependent variable $y$ that is to be associated with
argument $x_{n+1} = x_n + \Delta x$ is computed as follows.
Use an integration step of length $\Delta x$ and let

$$\begin{aligned}
k_1 &= f(x_n, y_n)\,\Delta x \\
k_2 &= f(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_1)\,\Delta x \\
k_3 &= f(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_2)\,\Delta x \\
k_4 &= f(x_n + \Delta x,\; y_n + k_3)\,\Delta x \\
\Delta y &= \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4),
\end{aligned} \tag{15-41}$$

then the value $y_{n+1}$ of $y$ corresponding to $x = x_n + \Delta x$ is determined by

$$y_{n+1} = y_n + \Delta y. \tag{15-42}$$

Example 15-14 Let us again determine the value $y(0.5)$ given that $y' = xy$,
with $y(0) = 1$ and $\Delta x = 0.1$. In this simple example, already used to
illustrate Euler’s method and its modification, we have $f(x, y) = xy$. As we must
anticipate an error of the order of $(0.1)^5$ we shall work to five decimal places
so that we may compare our solution with the exact result $y = e^{x^2/2}$.


n   x_n   y_n       f(x_n, y_n)   k_1       k_2       k_3       k_4       y_{n+1}   e^{x_n^2/2}

0   0.0   1.0       0.0           0.0       0.00500   0.00501   0.01005   1.00501   1.0
1   0.1   1.00501   0.10050       0.01005   0.01515   0.01519   0.02040   1.02020   1.00501
2   0.2   1.02020   0.20404       0.02040   0.02576   0.02583   0.03138   1.04603   1.02020
3   0.3   1.04603   0.31381       0.03138   0.03716   0.03726   0.04333   1.08329   1.04603
4   0.4   1.08329   0.43332       0.04332   0.04972   0.04987   0.05666   1.13315   1.08329
5   0.5   1.13315                                                                   1.13315


Comparison of the results of column three with the analytical solution
$y = e^{x^2/2}$ shows that it is in fact accurate to five decimal places, so that in this
case our rough error estimate was too severe.
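The scheme of Eqns (15-41) and (15-42) translates almost line for line into code. The sketch below is a minimal illustration with our own function names (`rk4_step`, `integrate`); it repeats the calculation of Example 15-14 in double precision rather than to five decimal places, so the intermediate values need not agree with the table in the last printed digit.

```python
import math


def rk4_step(f, x, y, dx):
    """Advance y by one Runge-Kutta step of length dx, following Eqns (15-41), (15-42)."""
    k1 = f(x, y) * dx
    k2 = f(x + dx / 2, y + k1 / 2) * dx
    k3 = f(x + dx / 2, y + k2 / 2) * dx
    k4 = f(x + dx, y + k3) * dx
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def integrate(f, x0, y0, dx, steps):
    """Apply rk4_step repeatedly, starting from (x0, y0)."""
    x, y = x0, y0
    for n in range(steps):
        y = rk4_step(f, x, y, dx)
        x = x0 + (n + 1) * dx      # avoid accumulating rounding error in x
    return y


# y' = x y, y(0) = 1, five steps of length 0.1 to reach x = 0.5
y_half = integrate(lambda x, y: x * y, 0.0, 1.0, 0.1, 5)
print(y_half, math.exp(0.5 ** 2 / 2))
```

Because the step length is an argument of `rk4_step`, it may be changed from one step to the next without any other modification, which is the flexibility referred to earlier in this section.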


The superiority of the Runge-Kutta method over the Euler and modified
Euler methods is clearly demonstrated if the Runge-Kutta solution is
compared with the previous solutions. This improvement is uniformly true and
not just in this instance, since it may be shown that the errors involved in
the Euler and modified Euler methods are, respectively, of the order $(\Delta x)^2$
and $(\Delta x)^3$. No discussion will be offered here of the more subtle finite
difference methods that may be used to provide integration formulae having
extremely high accuracy.

The Runge-Kutta method readily extends to allow the numerical solution
of simultaneous and higher order equations. Suppose the equations involved
are

$$\frac{dy}{dx} = f(x, y, z), \qquad \frac{dz}{dx} = g(x, y, z), \tag{15-43}$$

subject to the initial conditions $y = y_0$, $z = z_0$ at $x = x_0$.
Then, at the $n$th step of integration, setting

$$\begin{aligned}
k_1 &= f(x_n, y_n, z_n)\,\Delta x \\
k_2 &= f(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_1,\; z_n + \tfrac{1}{2}K_1)\,\Delta x \\
k_3 &= f(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_2,\; z_n + \tfrac{1}{2}K_2)\,\Delta x \\
k_4 &= f(x_n + \Delta x,\; y_n + k_3,\; z_n + K_3)\,\Delta x,
\end{aligned} \tag{15-44}$$

and

$$\begin{aligned}
K_1 &= g(x_n, y_n, z_n)\,\Delta x \\
K_2 &= g(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_1,\; z_n + \tfrac{1}{2}K_1)\,\Delta x \\
K_3 &= g(x_n + \tfrac{1}{2}\Delta x,\; y_n + \tfrac{1}{2}k_2,\; z_n + \tfrac{1}{2}K_2)\,\Delta x \\
K_4 &= g(x_n + \Delta x,\; y_n + k_3,\; z_n + K_3)\,\Delta x,
\end{aligned} \tag{15-45}$$

we use the following formulae to compute $\Delta y$ and $\Delta z$:

$$\Delta y = \tfrac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) \quad\text{and}\quad \Delta z = \tfrac{1}{6}(K_1 + 2K_2 + 2K_3 + K_4). \tag{15-46}$$

The values of $y$ and $z$ at the $(n + 1)$th step of integration are $y_{n+1} = y_n + \Delta y$,
$z_{n+1} = z_n + \Delta z$.

These results may also be used to integrate a second order equation by
introducing the first derivative as a new dependent variable. Suppose
$y'' - 2y' + 2y = 0$ with $y(0) = y'(0) = 1$. Then setting $y' = z$, the second
order equation is seen to be equivalent to the two first order simultaneous
equations $y' = z$ and $z' = 2(z - y)$, with $y(0) = 1$ and $z(0) = 1$. Applying
formulae Eqns (15-44) to (15-46) with $\Delta x = 0.2$, $f = z$, and $g = 2(z - y)$ in
order to determine $y(0.2)$ we find

$$\begin{aligned}
k_1 &= 0.2, & K_1 &= 0 \\
k_2 &= 0.2, & K_2 &= -0.04 \\
k_3 &= 0.196, & K_3 &= -0.048 \\
k_4 &= 0.1904, & K_4 &= -0.0976,
\end{aligned}$$

so that $\Delta y = 0.19706$ and $\Delta z = \Delta(y') = -0.04560$. Hence $y(0.2) = 1.19706$
and $y'(0.2) = 0.95440$, which are in complete agreement with the analytical
solution $y = e^x\cos x$.
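The pair of formulae (15-44) and (15-45) can be coded in the same way as the single-equation case. The sketch below is illustrative only (the name `rk4_system_step` is our own); it performs the single step of length 0.2 used in the example, with $f = z$ and $g = 2(z - y)$, and compares the result with $y = e^x\cos x$ and its derivative $y' = e^x(\cos x - \sin x)$.

```python
import math


def rk4_system_step(f, g, x, y, z, dx):
    """One Runge-Kutta step for y' = f(x, y, z), z' = g(x, y, z), Eqns (15-44) to (15-46)."""
    k1 = f(x, y, z) * dx
    K1 = g(x, y, z) * dx
    k2 = f(x + dx / 2, y + k1 / 2, z + K1 / 2) * dx
    K2 = g(x + dx / 2, y + k1 / 2, z + K1 / 2) * dx
    k3 = f(x + dx / 2, y + k2 / 2, z + K2 / 2) * dx
    K3 = g(x + dx / 2, y + k2 / 2, z + K2 / 2) * dx
    k4 = f(x + dx, y + k3, z + K3) * dx
    K4 = g(x + dx, y + k3, z + K3) * dx
    dy = (k1 + 2 * k2 + 2 * k3 + k4) / 6
    dz = (K1 + 2 * K2 + 2 * K3 + K4) / 6
    return y + dy, z + dz


# y'' - 2y' + 2y = 0 rewritten as y' = z, z' = 2(z - y), with y(0) = z(0) = 1
y1, z1 = rk4_system_step(
    lambda x, y, z: z,
    lambda x, y, z: 2 * (z - y),
    0.0, 1.0, 1.0, 0.2,
)
print(y1, math.exp(0.2) * math.cos(0.2))
print(z1, math.exp(0.2) * (math.cos(0.2) - math.sin(0.2)))
```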


15-7 Oscillatory solutions


Although we have already discussed the general method of solution of second 
order differential equations with constant coefficients, their importance in 
practical applications merits special mention when the inhomogeneous term 
is periodic. The second order differential equation

$$a\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = f(t) \tag{15-47}$$

characterizes many important physical situations. For example, when $a$
represents a mass, $b$ a damping force proportional to velocity, and $c$ a
restoring force, Eqn (15-47) could represent a mechanical vibration damper.
Alternatively, if $a$ represents an inductance, $b$ a resistance, and $c$ a capacitance,
Eqn (15-47) would describe an R-L-C circuit and, indeed, many other
situations are characterized by this simple equation.

By analogy with a mechanical system in which $f(t)$ represents the input
driving the system, the inhomogeneous term is sometimes called the forcing
function. It is the inhomogeneous term that gives rise to the particular integral,
and we again remark that part of the general solution is attributable solely
to the function $f(t)$.

We shall confine attention to the following particular form of Eqn (15.47):

$$y'' + 2\zeta y' + \Omega^2 y = a \sin \omega t, \tag{15.48}$$

where $a$ is called the amplitude of the forcing function $a \sin \omega t$, which has
frequency $\omega$ rad/s and period $2\pi/\omega$. The number $\zeta$ is usually called the
damping of the system described by Eqn (15.48), and $\Omega$ is then called the
natural frequency of the system. Several cases must be distinguished and
first we assume that $\zeta \neq 0$ with $\zeta^2 < \Omega^2$. Then, setting $\omega_0^2 = \Omega^2 - \zeta^2$, the
roots of the characteristic equation $P(\lambda) = \lambda^2 + 2\zeta\lambda + \Omega^2 = 0$ become
$\lambda = -\zeta \pm i\omega_0$. In terms of this new notation, the complementary function
$y_c$ can be written

$$y_c = A e^{-\zeta t} \sin(\omega_0 t + \varepsilon), \tag{15.49}$$

where $A$ and $\varepsilon$ are arbitrary constants. The particular integral $y_p$ can be
expressed in the form

$$y_p = P \sin \omega t + Q \cos \omega t, \tag{15.50}$$

where

$$P = \frac{a(\Omega^2 - \omega^2)}{(\Omega^2 - \omega^2)^2 + 4\zeta^2\omega^2}, \qquad Q = \frac{-2\zeta a\omega}{(\Omega^2 - \omega^2)^2 + 4\zeta^2\omega^2}.$$

In this context $\varepsilon$ is usually called the phase angle of the solution.
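Simple manipulation shows that $y_p$ may be expressed in the alternative
form

$$y_p = \frac{a}{[(\Omega^2 - \omega^2)^2 + 4\zeta^2\omega^2]^{1/2}} \sin(\omega t + \delta), \tag{15.51}$$

where $\delta = \arctan(Q/P)$. The complete solution can then be written

$$y = A e^{-\zeta t} \sin(\omega_0 t + \varepsilon) + \frac{a}{[(\Omega^2 - \omega^2)^2 + 4\zeta^2\omega^2]^{1/2}} \sin(\omega t + \delta). \tag{15.52}$$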
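
As a check on the coefficients of the particular integral, the following sketch substitutes $y_p = P \sin \omega t + Q \cos \omega t$ back into Eqn (15.48) and confirms that the amplitude $(P^2 + Q^2)^{1/2}$ equals the factor multiplying the sine in Eqn (15.51). The numerical values of $a$, $\zeta$, $\Omega$ and $\omega$ are arbitrary illustrative choices, and the function names are not from the text.

```python
import math

def particular_integral(a, zeta, Omega, omega):
    """Return (P, Q) for y'' + 2*zeta*y' + Omega^2*y = a*sin(omega*t)."""
    D = (Omega**2 - omega**2) ** 2 + 4 * zeta**2 * omega**2
    P = a * (Omega**2 - omega**2) / D
    Q = -2 * zeta * a * omega / D
    return P, Q

def residual(a, zeta, Omega, omega, t):
    """Left side minus right side of the equation, evaluated on y_p at time t."""
    P, Q = particular_integral(a, zeta, Omega, omega)
    y = P * math.sin(omega * t) + Q * math.cos(omega * t)
    dy = omega * (P * math.cos(omega * t) - Q * math.sin(omega * t))
    d2y = -omega**2 * y
    return d2y + 2 * zeta * dy + Omega**2 * y - a * math.sin(omega * t)

a, zeta, Omega, omega = 2.0, 0.3, 1.5, 1.1   # illustrative values only
P, Q = particular_integral(a, zeta, Omega, omega)
D = (Omega**2 - omega**2) ** 2 + 4 * zeta**2 * omega**2
worst = max(abs(residual(a, zeta, Omega, omega, 0.1 * n)) for n in range(100))
print(worst)                                   # rounding error only
print(math.hypot(P, Q), a / math.sqrt(D))      # the two amplitudes agree
```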


If $\zeta > 0$, then as time increases the influence of the complementary
function on the complete solution will diminish. In these circumstances,
after a suitable lapse of time, only the particular integral will remain and will
describe what is often called the steady state behaviour. This is to be
interpreted in the sense that the complementary function, which essentially
describes how the solution started, has ceased to influence the solution. It is
for this reason that the complementary function is often said to describe the
transient behaviour of the solution.

If we agree to call a solution stable when it is bounded in magnitude for
all time, it can be seen from the form of $y_p$ in Eqn (15.51) and our discussion
of $y_c$, that the solution Eqn (15.52) is stable provided $\zeta > 0$.

Examining the steady state solution Eqn (15.51) for a stable equation, we
notice that the sine function has an amplitude $A(\omega)$ which is frequency
dependent:

$$A(\omega) = \frac{a}{[(\Omega^2 - \omega^2)^2 + 4\zeta^2\omega^2]^{1/2}}. \tag{15.53}$$

It is readily established that the denominator of $A(\omega)$ has a minimum
when $\omega = \omega_c$, where $\omega_c^2 = \Omega^2 - 2\zeta^2$. Hence the maximum amplitude
$A_{\max}$ attained by the steady state solution must occur when $\omega = \omega_c$, and it
has the value:

$$A_{\max} = \frac{a}{2\zeta\Omega\left(1 - \dfrac{\zeta^2}{\Omega^2}\right)^{1/2}}. \tag{15.54}$$

The frequency $\omega_c$ at which $A_{\max}$ occurs is called the resonant frequency and
it can be seen that when there is zero damping ($\zeta = 0$), the original Eqn
(15.48) describes simple harmonic motion for which $A_{\max} \to \infty$ as $\omega \to \Omega$,
which is then the natural frequency of the system. That is to say, $\Omega$ is the
frequency of oscillations when the forcing function is removed.
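
The location and height of the resonance peak can be confirmed numerically. The sketch below evaluates Eqn (15.53) on a fine grid of frequencies and compares the largest value found with Eqn (15.54); the parameter values and the helper names `amplitude` and `resonance` are illustrative assumptions.

```python
import math

def amplitude(a, zeta, Omega, omega):
    """Steady state amplitude A(omega) of Eqn (15.53)."""
    return a / math.sqrt((Omega**2 - omega**2) ** 2 + 4 * zeta**2 * omega**2)

def resonance(a, zeta, Omega):
    """Resonant frequency and peak amplitude, assuming 0 < 2*zeta^2 < Omega^2."""
    wc_sq = Omega**2 - 2 * zeta**2
    if zeta <= 0 or wc_sq <= 0:
        raise ValueError("formula assumes 0 < 2*zeta^2 < Omega^2")
    a_max = a / (2 * zeta * Omega * math.sqrt(1 - zeta**2 / Omega**2))
    return math.sqrt(wc_sq), a_max

a, zeta, Omega = 1.0, 0.1, 1.0   # illustrative values only
w_c, a_max = resonance(a, zeta, Omega)
scan = max(amplitude(a, zeta, Omega, 0.001 * n) for n in range(1, 3001))
print(w_c, a_max, scan)
```

The grid maximum never exceeds the closed-form peak and approaches it as the grid is refined.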

If $0 < \zeta < \Omega$ the complementary function is oscillatory, and physical
systems having a damping $\zeta$ in this range are said to be normally damped.

Fig. 15.1 Amplification factor $A(\omega)$ as function of normalized damping $\zeta/\Omega$ and
nondimensional frequency $\omega/\Omega$.

If, however, $\zeta > \Omega$ the complementary function or transient solution becomes

$$y_c = C_1 e^{k_1 t} + C_2 e^{k_2 t},$$

where $k_1 = -\zeta + (\zeta^2 - \Omega^2)^{1/2}$ and $k_2 = -\zeta - (\zeta^2 - \Omega^2)^{1/2}$, and is no
longer oscillatory. The associated physical system is then said to be
overdamped.

A critical case occurs when $\zeta = \Omega$, for which the complementary function
becomes

$$y_c = (C_1 + C_2 t)e^{-\zeta t}. \tag{15.55}$$

In these circumstances the associated physical system is said to be critically
damped.
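
The three cases can be summarized in a small sketch that computes the roots of the characteristic equation $\lambda^2 + 2\zeta\lambda + \Omega^2 = 0$ and names the regime. The function names and the label "undamped" for $\zeta = 0$ are illustrative choices, and exact floating point comparison is used for the critical case only because the sample values are exactly representable.

```python
import cmath

def characteristic_roots(zeta, Omega):
    """Roots of lambda^2 + 2*zeta*lambda + Omega^2 = 0 (complex when zeta < Omega)."""
    s = cmath.sqrt(zeta**2 - Omega**2)
    return -zeta + s, -zeta - s

def damping_regime(zeta, Omega):
    """Name the regime, assuming zeta >= 0 and Omega > 0."""
    if zeta < 0 or Omega <= 0:
        raise ValueError("sketch assumes zeta >= 0 and Omega > 0")
    if zeta == 0:
        return "undamped"
    if zeta < Omega:
        return "normally damped"
    if zeta == Omega:
        return "critically damped"
    return "overdamped"

for zeta in (0.5, 2.0, 2.5):
    print(zeta, damping_regime(zeta, 2.0), characteristic_roots(zeta, 2.0))
```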

The amplitude $A(\omega)$ is essentially an amplification factor for the forcing
function input $a \sin \omega t$ and it is convenient to summarize the results of this
section by constructing a graph of $A(\omega)$ versus $\omega$ for different values of the
damping $\zeta$.
This is illustrated in Fig. 15.1 for a representative range of values of $\zeta$.
The reason for the infinite amplification factor at $\omega = \omega_c$ in the case of zero
damping may be readily appreciated by solving the equation

$$y'' + \Omega^2 y = a \sin \Omega t.$$

The complete solution here is

$$y = A \sin(\Omega t + \varepsilon) - \frac{a}{2\Omega}\, t \cos \Omega t, \tag{15.56}$$

and although the complementary function is finitely bounded for all time, the
particular integral is not. A differential equation of this form could, for
example, describe the motion of a simple pendulum excited by a periodic
disturbance at exactly its natural frequency. The disturbing force would
always be in phase with the motion and so would continually reinforce it,
thereby causing the amplitude to increase without bound.
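
That the growing term in Eqn (15.56) really is a particular integral can be checked by direct differentiation. The following lines are a verification sketch, not part of the original derivation:

```latex
\begin{align*}
y_p   &= -\frac{a}{2\Omega}\, t\cos\Omega t,\\
y_p'  &= -\frac{a}{2\Omega}\cos\Omega t + \frac{a}{2}\, t\sin\Omega t,\\
y_p'' &= a\sin\Omega t + \frac{a\Omega}{2}\, t\cos\Omega t,\\
y_p'' + \Omega^2 y_p &= a\sin\Omega t + \frac{a\Omega}{2}\, t\cos\Omega t
                        - \frac{a\Omega}{2}\, t\cos\Omega t = a\sin\Omega t.
\end{align*}
```

The factor $t$ multiplying $\cos \Omega t$ is what makes the amplitude grow linearly with time.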


15.8 Coupled oscillations and normal modes


A great many physical situations can be described approximately in terms of
coupled oscillatory systems, each having properties of the type discussed in
the previous section. Such is the case in electrical circuits containing
inductance, in many mechanical oscillation problems, and in certain forms of
interacting control system.

A systematic examination of these problems is not appropriate here so, 
instead, attention will be confined to a typical but simple form of the problem 
containing neither damping nor inhomogeneous terms in the equations. 
Expressed in more physical terms, we shall confine attention to coupled 
simple harmonic type equations involving no forcing functions. 

The following is a typical mechanical vibration problem. We suppose that
a light elastic string stretched between two fixed points A and B has masses
$3m$ and $2m$ attached to it at points P and Q, where AP = $l$, PQ = $l$, QB = $l$.
The tension in the string is $kml$ where $k$ is the elastic constant of the string.
Our task will be to determine whether there are preferred frequencies and, if
so, the manner of vibration of the system when only small displacements are
to be considered. We shall also determine the subsequent motion of the system
if initially only the mass $3m$ is given a small lateral displacement $d$ and is then
released from rest.

Fig. 15.2 Elastic string and mass system.

The small lateral displacements of masses $3m$ and $2m$ will be denoted by
$x$ and $y$, respectively (see Fig. 15.2).

Neglecting gravity and using the fact that the system is non-dissipative,
energy considerations lead easily to the equations of motion

$$3m\frac{d^2x}{dt^2} + km(2x - y) = 0$$

and

$$2m\frac{d^2y}{dt^2} + km(2y - x) = 0.$$

Thus we must consider the solution of the simultaneous differential
equations

$$3\frac{d^2x}{dt^2} + 2kx - ky = 0$$

and

$$2\frac{d^2y}{dt^2} + 2ky - kx = 0.$$


Now although the use of matrices can easily be avoided when solving this
second order system of equations as they are usually termed, it will be more
instructive to utilize them. Accordingly, defining the matrices $X$, $M$, and $A$
to be

$$X = \begin{bmatrix} x \\ y \end{bmatrix}, \qquad M = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}, \qquad A = \begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix},$$

and defining $d^2X/dt^2$ by the expression

$$\frac{d^2X}{dt^2} = \begin{bmatrix} d^2x/dt^2 \\ d^2y/dt^2 \end{bmatrix},$$

we see that we must solve the matrix differential equation

$$M\frac{d^2X}{dt^2} + AX = 0.$$


This now bears a striking resemblance to the familiar simple harmonic
equation encountered when dealing with simple pendulum problems. Indeed,
the resemblance becomes even closer if we notice that as $\det M = 6 \neq 0$, we
may pre-multiply the matrix differential equation by $M^{-1}$ to obtain

$$\frac{d^2X}{dt^2} + KX = 0, \quad \text{with } K = M^{-1}A.$$

To find if there are preferred frequencies and periodic solutions let us set
$X = B \sin(\omega t + \varepsilon)$, where $\omega$ is a frequency, $\varepsilon$ is a phase, and $B$ is the
constant column vector

$$B = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.$$

It is now necessary to find relationships that exist between $\omega$, $b_1$, and $b_2$.
Using the equation $X = B \sin(\omega t + \varepsilon)$, it follows by differentiation that
$d^2X/dt^2 = -\omega^2 B \sin(\omega t + \varepsilon)$, so that substitution into the matrix
differential equation gives the result

$$(KB - \omega^2 B) \sin(\omega t + \varepsilon) = 0.$$

Now the scalar multiplier $\sin(\omega t + \varepsilon)$ is not identically zero, so we may
cancel it and, after taking out the constant post-multiplier $B$ as a factor, we
arrive at the matrix equation

$$(K - \omega^2 I)B = 0.$$

Notice that as $B$ is not a scalar it may not be cancelled from the result.
This expression is simply a pair of homogeneous simultaneous equations for
the elements $b_1$ and $b_2$ of vector $B$, and from our previous study of such
equations we know that a non-trivial solution will only be possible if the
determinant of the coefficient matrix vanishes. That is, $b_1$ and $b_2$ may be
determined, not both zero, provided that

$$|K - \omega^2 I| = 0.$$

This is usually called the characteristic determinant of the system.
Returning to the data of the problem we see that

$$M^{-1} = \begin{bmatrix} \tfrac{1}{3} & 0 \\ 0 & \tfrac{1}{2} \end{bmatrix}, \quad \text{showing that} \quad K = M^{-1}A = \begin{bmatrix} \tfrac{2}{3}k & -\tfrac{1}{3}k \\ -\tfrac{1}{2}k & k \end{bmatrix},$$

and hence the characteristic determinant of the system is

$$\begin{vmatrix} \tfrac{2}{3}k - \omega^2 & -\tfrac{1}{3}k \\ -\tfrac{1}{2}k & k - \omega^2 \end{vmatrix} = 0.$$

This is just an equation for $\omega^2$. Expanding the determinant we arrive at the
characteristic equation of the system:

$$\omega^4 - \tfrac{5}{3}k\omega^2 + \tfrac{1}{2}k^2 = 0.$$

Solving this for $\omega^2$ we find that the characteristic determinant will vanish,
so that the system will give rise to values $b_1$ and $b_2$ not both identically zero,
only when $\omega^2 = \omega_1^2 = k(5 - \sqrt{7})/6$ or $\omega^2 = \omega_2^2 = k(5 + \sqrt{7})/6$. $\omega_1$ and
$\omega_2$ are called the natural frequencies of the system as they describe the only
purely sinusoidal oscillations that occur naturally in the system.
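
The two values of $\omega^2$ can be checked numerically from the trace and determinant of $K$. The sketch below takes $k = 1$ purely for illustration; the function names are not from the text.

```python
import math

def k_matrix(k):
    """K = M^{-1} A for M = diag(3, 2) and A = [[2k, -k], [-k, 2k]]."""
    return [[2 * k / 3, -k / 3], [-k / 2, k]]

def squared_frequencies(K):
    """Roots w^2 of det(K - w^2 I) = 0 for a 2x2 matrix K, smallest first."""
    trace = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    root = math.sqrt(trace**2 - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

k = 1.0   # illustrative value of the elastic constant
w1_sq, w2_sq = squared_frequencies(k_matrix(k))
print(w1_sq, k * (5 - math.sqrt(7)) / 6)
print(w2_sq, k * (5 + math.sqrt(7)) / 6)
```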

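To find the values of $b_1$ and $b_2$ corresponding to these natural frequencies
we return to the equation

$$(K - \omega^2 I)B = 0$$

and solve for $B$, first with $\omega^2 = \omega_1^2$ and then with $\omega^2 = \omega_2^2$.
We begin by setting $\omega^2 = \omega_1^2$ to obtain the matrix equation

$$\begin{bmatrix} \left(\dfrac{-1 + \sqrt{7}}{6}\right)k & -\tfrac{1}{3}k \\ -\tfrac{1}{2}k & \left(\dfrac{1 + \sqrt{7}}{6}\right)k \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = 0,$$

which, as $k \neq 0$, reduces to the two scalar equations

$$\left(\frac{-1 + \sqrt{7}}{6}\right)b_1 - \tfrac{1}{3}b_2 = 0$$

and

$$-\tfrac{1}{2}b_1 + \left(\frac{1 + \sqrt{7}}{6}\right)b_2 = 0.$$

Solving either of these homogeneous equations which, because of the
manner of determination of $\omega^2$, are of course compatible, we find that
$b_2^{(1)} = 3b_1^{(1)}/(1 + \sqrt{7})$; the superscript 1 indicates that these are the values
assumed by $b_1$ and $b_2$ when $\omega^2 = \omega_1^2$.

As the equations are homogeneous they only determine the ratio
$b_1^{(1)} : b_2^{(1)}$, and the value of either $b_1^{(1)}$ or $b_2^{(1)}$ may thus be assigned arbitrarily.
Accordingly we shall choose to make $b_1^{(1)} = 1$, when

$$B^{(1)} = \begin{bmatrix} 1 \\ \dfrac{3}{1 + \sqrt{7}} \end{bmatrix} \quad \text{and, consequently,} \quad X(t)^{(1)} = \begin{bmatrix} 1 \\ \dfrac{3}{1 + \sqrt{7}} \end{bmatrix} \sin(\omega_1 t + \varepsilon_1),$$

where the superscript 1 indicates that these are the forms assumed by $B$ and
$X(t)$ when $\omega^2 = \omega_1^2$.
A similar argument in the case $\omega^2 = \omega_2^2$ shows that

$$B^{(2)} = \begin{bmatrix} 1 \\ \dfrac{3}{1 - \sqrt{7}} \end{bmatrix} \quad \text{and} \quad X(t)^{(2)} = \begin{bmatrix} 1 \\ \dfrac{3}{1 - \sqrt{7}} \end{bmatrix} \sin(\omega_2 t + \varepsilon_2).$$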
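
The two column vectors can be verified by substituting them into $(K - \omega^2 I)B = 0$. The following sketch does this for $k = 1$, an illustrative value; `mode_residual` is a hypothetical helper name.

```python
import math

def mode_residual(k, w_sq, B):
    """Components of (K - w^2 I) B for the K of this problem."""
    K = [[2 * k / 3, -k / 3], [-k / 2, k]]
    return [(K[0][0] - w_sq) * B[0] + K[0][1] * B[1],
            K[1][0] * B[0] + (K[1][1] - w_sq) * B[1]]

k = 1.0
r7 = math.sqrt(7)
modes = [
    (k * (5 - r7) / 6, [1.0, 3 / (1 + r7)]),   # first mode
    (k * (5 + r7) / 6, [1.0, 3 / (1 - r7)]),   # second mode
]
for w_sq, B in modes:
    print(w_sq, B, mode_residual(k, w_sq, B))
```

Both residuals vanish to rounding error. The second component is positive in the first mode and negative in the second, so the masses move in the same sense at the lower frequency and in opposite senses at the higher one.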
Thus $X(t)^{(1)}$ and $X(t)^{(2)}$ describe the purely sinusoidal forms of disturbance
that are possible when $\omega^2 = \omega_1^2$ and $\omega^2 = \omega_2^2$, respectively. Both of these
are possible solutions to the original system of differential equations and, as
the differential equations are linear, the general solution $X(t)$ must be of the
form

$$X(t) = \alpha X(t)^{(1)} + \beta X(t)^{(2)},$$

where $\alpha$, $\beta$ are arbitrary constants. In more advanced works the solutions
$X(t)^{(1)}$ and $X(t)^{(2)}$ are called eigensolutions and the numbers $\omega_1^2$ and $\omega_2^2$ are
then given the name eigenvalues.

The solution $X(t)$ is the matrix equivalent of the complementary function
encountered at the start of this chapter. To find the solution satisfying any
given initial conditions it now only remains to determine the constants $\alpha$ and
$\beta$ and the arbitrary phase angles $\varepsilon_1$ and $\varepsilon_2$.

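To complete the problem in question we now make use of the fact that
the system starts from rest at time $t = 0$ with $x = d$, $y = 0$. In terms of $X(t)$
this yields the initial conditions

$$\left.\frac{dX}{dt}\right|_{t=0} = 0 \quad \text{and} \quad X(0) = \begin{bmatrix} d \\ 0 \end{bmatrix}.$$

Applying the first of these conditions to $X(t)$ we obtain

$$\left.\frac{dX}{dt}\right|_{t=0} = 0 = \alpha\omega_1 B^{(1)} \cos \varepsilon_1 + \beta\omega_2 B^{(2)} \cos \varepsilon_2,$$

showing that $\varepsilon_1 = \varepsilon_2 = \tfrac{1}{2}\pi$. The second condition gives

$$X(0) = \begin{bmatrix} d \\ 0 \end{bmatrix} = \alpha \begin{bmatrix} 1 \\ \dfrac{3}{1 + \sqrt{7}} \end{bmatrix} + \beta \begin{bmatrix} 1 \\ \dfrac{3}{1 - \sqrt{7}} \end{bmatrix}.$$

Hence,

$$d = \alpha + \beta, \qquad 0 = \frac{3\alpha}{1 + \sqrt{7}} + \frac{3\beta}{1 - \sqrt{7}},$$

and so

$$\alpha = \frac{(\sqrt{7} + 1)d}{2\sqrt{7}}, \qquad \beta = \frac{(\sqrt{7} - 1)d}{2\sqrt{7}}.$$

In terms of these constants $\alpha$, $\beta$ the solution to the explicit initial value
problem posed at the start of the section is

$$X(t) = \alpha B^{(1)} \sin(\omega_1 t + \tfrac{1}{2}\pi) + \beta B^{(2)} \sin(\omega_2 t + \tfrac{1}{2}\pi).$$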
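
A sketch of the whole initial value problem follows. It solves the two linear conditions for $\alpha$ and $\beta$, builds the solution from the two normal modes using $\sin(\omega t + \tfrac{1}{2}\pi) = \cos \omega t$, and checks both the initial displacement and the equations of motion $3\ddot{x} + 2kx - ky = 0$, $2\ddot{y} + 2ky - kx = 0$. The values of $k$ and $d$ are illustrative assumptions.

```python
import math

k, d = 1.0, 0.01   # illustrative values only
r7 = math.sqrt(7)
w1 = math.sqrt(k * (5 - r7) / 6)
w2 = math.sqrt(k * (5 + r7) / 6)
b1, b2 = 3 / (1 + r7), 3 / (1 - r7)   # second components of B(1) and B(2)

# Solve alpha + beta = d and alpha*b1 + beta*b2 = 0.
beta = d * b1 / (b1 - b2)
alpha = d - beta

def state(t):
    """Return x, y and their second time derivatives at time t."""
    c1, c2 = math.cos(w1 * t), math.cos(w2 * t)
    x = alpha * c1 + beta * c2
    y = alpha * b1 * c1 + beta * b2 * c2
    x_tt = -alpha * w1**2 * c1 - beta * w2**2 * c2
    y_tt = -alpha * b1 * w1**2 * c1 - beta * b2 * w2**2 * c2
    return x, y, x_tt, y_tt

x0, y0, _, _ = state(0.0)
worst = 0.0
for n in range(200):
    x, y, x_tt, y_tt = state(0.1 * n)
    worst = max(worst,
                abs(3 * x_tt + 2 * k * x - k * y),
                abs(2 * y_tt + 2 * k * y - k * x))
print(alpha, beta, x0, y0, worst)
```

Because each mode varies as a cosine, both velocities vanish at $t = 0$, so the release-from-rest condition holds automatically.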
The roles played by the phase angles $\varepsilon_1$ and $\varepsilon_2$ are most important since
they serve to adjust the time origins of the eigensolutions $X(t)^{(1)}$ and $X(t)^{(2)}$
at the start of the prescribed motion. The constants $\alpha$ and $\beta$ are just scale
factors.

The four constants $\alpha$, $\beta$, $\varepsilon_1$, and $\varepsilon_2$ are, of course, the four arbitrary
constants that our previous work has led us to expect to be associated with two
simultaneous second order equations, though the manner of their appearance
here is perhaps slightly unfamiliar.

In vibration problems it is common to refer to the fundamental
eigensolutions $X(t)^{(1)}$ and $X(t)^{(2)}$ as the normal modes associated with the problem.
This arises on account of the fact that each solution of this kind is a pure
sinusoidal disturbance describing a specially simple and characteristic mode of
vibration. Thus, for example, in the first mode $X(t)^{(1)}$, vibrations will be of
the form

$$x = A \sin(\omega_1 t + \tfrac{1}{2}\pi), \qquad y = \left(\frac{3A}{1 + \sqrt{7}}\right) \sin(\omega_1 t + \tfrac{1}{2}\pi),$$

with $A$ an arbitrary constant.
The apparent choice of sign that is possible for both $\omega_1$ and $\omega_2$ is
immaterial, since it may be absorbed into the determination of $\alpha$, $\beta$, $\varepsilon_1$, and $\varepsilon_2$.


15.9 The Laplace transform


We have already encountered one operator method for solving linear differ- 
ential equations in the form of the operator D introduced earlier. We saw 
there how in some respects the operator D could be manipulated as though it 
were an algebraic quantity, and with the introduction of four basic rules, the 
solution of certain standard forms of differential equation became a matter 
of routine. However, when more complicated equations require solution the 
operator D method is less successful and a more powerful operator method 
must be introduced. 

The operator method that we now outline has the required algebraic 
simplicity when applied to elementary differential equations of the type 
already discussed, and yet it is capable of direct extension to deal with very 
general forms of linear differential equation. As this extension requires the 
use of advanced ideas from the theory of functions of a complex variable, 
our treatment will be restricted to the simpler applications of the Laplace 
transform which only utilize real variable methods. 


DEFINITION 15.4 (Laplace transform) The Laplace transform of the function
$f(x)$ is usually denoted either by $\mathcal{L}[f(x)]$, or by $\mathcal{L}[f]$, and it is defined by
the integral

$$\mathcal{L}[f(x)] = \int_0^\infty e^{-px} f(x)\,dx. \tag{15.57}$$

The transform $\mathcal{L}[f]$ will obviously only exist if $f(x)$ is a function for
which the integral in Eqn (15.57) exists. Essentially, the transform is a rule
for assigning to each function $f(x)$ for which Eqn (15.57) exists, a unique
function $\mathcal{L}[f]$ which is a function of the transform variable $p$. The two
functions $f(x)$ and $\mathcal{L}[f(x)]$ are called transform pairs.


The success of this method lies in the fact that the manipulation of the 
algebraic equation that results when a differential equation is transformed 
proves to be easier than the manipulation of the original differential equation. 
When the transformed dependent variable has been found algebraically, an 
appropriate inverse operator must then be used to obtain the solution as a 
function of x, and it is at this point in the general theory of the Laplace 
transform that advanced complex variable methods need to be used. How- 
ever, in the more elementary problems studied here it will suffice to use 
straightforward algebraic methods to simplify the transformed solution to 
the point at which the solution as a function of x can be recognized by the 
use of a simple table of transform pairs. 


Table 15.1 Table of Laplace transform pairs

$f(x)$                    $\mathcal{L}[f(x)]$
$1$                       $1/p$                               $(p > 0)$
$x$                       $1/p^2$                             $(p > 0)$
$x^n$                     $n!/p^{n+1}$                        $(p > 0)$
$e^{ax}$                  $1/(p - a)$                         $(p > a)$
$x^n e^{ax}$              $n!/(p - a)^{n+1}$                  $(p > a)$
$\sin mx$                 $m/(p^2 + m^2)$                     $(p > 0)$
$\cos mx$                 $p/(p^2 + m^2)$                     $(p > 0)$
$e^{-ax} \sin mx$         $m/[(p + a)^2 + m^2]$               $(p > -a)$
$e^{-ax} \cos mx$         $(p + a)/[(p + a)^2 + m^2]$         $(p > -a)$
Let us now determine the Laplace transform of $f(x) = e^{ax}$. Using Eqn
(15.57) and setting $f(x) = e^{ax}$, we obtain

$$\mathcal{L}[e^{ax}] = \int_0^\infty e^{-px} e^{ax}\,dx = \frac{1}{p - a}.$$

Now as this integral will only exist if $p > a$, it is necessary to add that the
Laplace transform of $e^{ax}$ is $(p - a)^{-1}$ when $p > a$. It is a simple matter to
establish the list of Laplace transform pairs in Table 15.1.

In order to apply the Laplace transform to differential equations the
transform properties of derivatives must be determined. Let us first do this
for $dy/dx$ by setting $f(x) = y'$ in Eqn (15.57) to obtain

$$\mathcal{L}[y'] = \int_0^\infty y' e^{-px}\,dx = \int_0^\infty e^{-px}\,d(y).$$

Integrating by parts gives

$$\mathcal{L}[y'] = \left[ y e^{-px} \right]_0^\infty + p \int_0^\infty y e^{-px}\,dx$$

and so, provided $y e^{-px} \to 0$ as $x \to \infty$, we see that

$$\mathcal{L}[y'] = -y(0) + p\mathcal{L}[y]. \tag{15.58}$$

Using the same method it is easily established that

$$\mathcal{L}[y''] = -y'(0) - py(0) + p^2\mathcal{L}[y] \tag{15.59}$$

$$\mathcal{L}[y'''] = -y''(0) - py'(0) - p^2 y(0) + p^3\mathcal{L}[y] \tag{15.60}$$

and, in general,

$$\mathcal{L}[y^{(n)}] = -y^{(n-1)}(0) - py^{(n-2)}(0) - \cdots - p^{n-1}y(0) + p^n\mathcal{L}[y]. \tag{15.61}$$
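
Entries of the table and the derivative rule (15.58) can be spot-checked by numerical integration. The sketch below approximates the integral in Eqn (15.57) by Simpson's rule on a truncated interval; the truncation point, the step count, and the sample values of $p$, $a$ and $m$ are assumptions chosen so that the neglected tail is negligible.

```python
import math

def laplace(f, p, upper=60.0, n=60000):
    """Approximate the integral of exp(-p*x)*f(x) over [0, upper] by Simpson's rule."""
    h = upper / n                      # n must be even
    g = lambda x: math.exp(-p * x) * f(x)
    total = g(0.0) + g(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

p, a, m = 2.0, 0.5, 3.0   # illustrative values with p > a and p > 0
L_exp = laplace(lambda x: math.exp(a * x), p)
L_sin = laplace(lambda x: math.sin(m * x), p)
L_dsin = laplace(lambda x: m * math.cos(m * x), p)   # transform of y' for y = sin(m x)

print(L_exp, 1 / (p - a))
print(L_sin, m / (p**2 + m**2))
print(L_dsin, p * L_sin)   # rule (15.58) with y(0) = sin 0 = 0
```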


Let us now apply the Laplace transform to obtain the solutions of two 
typical constant coefficient second order equations. 


Example 15.15 First we consider the simple differential equation $y'' - 3y' + 2y = e^{-x}$
subject to the initial conditions $y(0) = 1$, $y'(0) = 0$. Using
transforms Eqns (15.58) and (15.59) together with the fourth entry in the
table of Laplace transforms we arrive at the transformed differential equation

$$-y'(0) - py(0) + p^2\mathcal{L}[y] + 3y(0) - 3p\mathcal{L}[y] + 2\mathcal{L}[y] = \frac{1}{p + 1}.$$

Making use of the initial conditions and simplifying then gives

$$(p^2 - 3p + 2)\mathcal{L}[y] = \frac{p^2 - 2p - 2}{p + 1}.$$

It should be noticed that the multiplier of $\mathcal{L}[y]$ has the same form as the
characteristic polynomial, though with the operator $p$ replacing $\lambda$. Hence the
transform $\mathcal{L}[y]$ of the solution is given by

$$\mathcal{L}[y] = \frac{p^2 - 2p - 2}{(p + 1)(p^2 - 3p + 2)}.$$

Using partial fractions it is easily established that we may write

$$\mathcal{L}[y] = \frac{3}{2}\left(\frac{1}{p - 1}\right) - \frac{2}{3}\left(\frac{1}{p - 2}\right) + \frac{1}{6}\left(\frac{1}{p + 1}\right).$$

Again referring to the fourth entry in the table of Laplace transforms
shows that the fractions in brackets are simply the transforms of $e^x$, $e^{2x}$,
and $e^{-x}$, respectively, so that the complete solution must be

$$y = \frac{3}{2}e^x - \frac{2}{3}e^{2x} + \frac{1}{6}e^{-x}.$$
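
The partial fraction coefficients can be confirmed with exact rational arithmetic. The sketch below uses the standard cover-up rule for distinct simple poles, which is an assumption about the method (the text does not say how the fractions were obtained), and then checks the initial conditions from the resulting sum of exponentials.

```python
from fractions import Fraction

def simple_pole_coefficients(numerator, poles):
    """Coefficients c_q in numerator(p) / prod(p - q) = sum c_q / (p - q), distinct poles."""
    coeffs = {}
    for q in poles:
        denom = Fraction(1)
        for r in poles:
            if r != q:
                denom *= Fraction(q - r)
        coeffs[q] = numerator(Fraction(q)) / denom
    return coeffs

# L[y] = (p^2 - 2p - 2) / ((p - 1)(p - 2)(p + 1)) from Example 15.15
coeffs = simple_pole_coefficients(lambda p: p * p - 2 * p - 2, [1, 2, -1])
print(coeffs)

# Each term c/(p - q) inverts to c*exp(q*x), so y(0) and y'(0) are simple sums.
y0 = sum(coeffs.values())
dy0 = sum(c * q for q, c in coeffs.items())
print(y0, dy0)
```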


Example 15.16 Finally, let us suppose that $y'' - 2y' + y = e^x$ and that
$y(0) = 0$, $y'(0) = 1$. Then transforming the equation as before gives

$$-y'(0) - py(0) + p^2\mathcal{L}[y] + 2y(0) - 2p\mathcal{L}[y] + \mathcal{L}[y] = \frac{1}{p - 1},$$

which on using the initial conditions becomes

$$(p^2 - 2p + 1)\mathcal{L}[y] = \frac{p}{p - 1}.$$

Hence, dividing by $(p - 1)^2$ and using partial fractions we arrive at

$$\mathcal{L}[y] = \frac{p}{(p - 1)^3} = \frac{1}{(p - 1)^2} + \frac{1}{(p - 1)^3},$$

so that identifying the transform of $x^n e^{ax}$ shows that the complete solution is

$$y = xe^x + \tfrac{1}{2}x^2 e^x.$$


Other constant coefficient linear differential equations with simple
inhomogeneous terms may be solved in a strictly analogous fashion.

The transform method is particularly convenient when dealing with
simultaneous equations as we illustrate by our last example.


Example 15.17 We shall determine the solution of

$$\frac{dy}{dx} = y + 5z, \qquad \frac{dz}{dx} = -y - 3z,$$

given that $y(0) = 1$, $z(0) = 0$.
Introducing the Laplace transforms $\mathcal{L}[y]$ and $\mathcal{L}[z]$ of the dependent
variables $y$ and $z$, the transformed equations become

$$-y(0) + p\mathcal{L}[y] = \mathcal{L}[y] + 5\mathcal{L}[z]$$
$$-z(0) + p\mathcal{L}[z] = -\mathcal{L}[y] - 3\mathcal{L}[z].$$

Using the initial conditions and simplifying we find that

$$\begin{aligned} (p - 1)\mathcal{L}[y] - 5\mathcal{L}[z] &= 1 \\ \mathcal{L}[y] + (p + 3)\mathcal{L}[z] &= 0, \end{aligned} \tag{15.62}$$

which have the solutions

$$\mathcal{L}[y] = \frac{p + 3}{p^2 + 2p + 2}, \qquad \mathcal{L}[z] = \frac{-1}{p^2 + 2p + 2}. \tag{15.63}$$

If these transformed solutions are written in the form

$$\mathcal{L}[y] = \frac{(p + 1)}{(p + 1)^2 + 1} + \frac{2}{(p + 1)^2 + 1} \quad \text{and} \quad \mathcal{L}[z] = \frac{-1}{(p + 1)^2 + 1},$$

the table of Laplace transforms may be used to identify the solutions as
being

$$y = e^{-x}(\cos x + 2 \sin x), \qquad z = -e^{-x} \sin x.$$
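When dealing with several simultaneous differential equations it is often
advantageous to introduce matrices to assist with the determination of the
transformed solutions. Thus, if we define

$$A = \begin{bmatrix} p - 1 & -5 \\ 1 & p + 3 \end{bmatrix}, \qquad L = \begin{bmatrix} \mathcal{L}[y] \\ \mathcal{L}[z] \end{bmatrix}, \qquad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$

Eqns (15.62) can be written in the compact matrix form

$$AL = B.$$

Computing the inverse matrix $A^{-1}$ then gives

$$A^{-1} = \frac{1}{p^2 + 2p + 2} \begin{bmatrix} p + 3 & 5 \\ -1 & p - 1 \end{bmatrix},$$

so that the equation $L = A^{-1}B$ becomes

$$\begin{bmatrix} \mathcal{L}[y] \\ \mathcal{L}[z] \end{bmatrix} = \frac{1}{p^2 + 2p + 2} \begin{bmatrix} p + 3 & 5 \\ -1 & p - 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$

from which Eqns (15.63) immediately follow. Thereafter the solution
proceeds as before.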
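
The matrix step can be checked at a few sample values of the transform variable. The sketch below forms the 2 x 2 inverse explicitly with exact fractions and compares $A^{-1}B$ with the closed forms of Eqns (15.63); the sample values of $p$ and the function name are arbitrary.

```python
from fractions import Fraction

def solve_transformed_system(p):
    """Solve A L = B of Eqns (15.62) at one value of p using the 2x2 inverse."""
    A = [[p - 1, Fraction(-5)], [Fraction(1), p + 3]]
    B = [Fraction(1), Fraction(0)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]   # equals p^2 + 2p + 2
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    return [inv[0][0] * B[0] + inv[0][1] * B[1],
            inv[1][0] * B[0] + inv[1][1] * B[1]]

for p in (Fraction(1), Fraction(5, 2), Fraction(7)):
    Ly, Lz = solve_transformed_system(p)
    q = p * p + 2 * p + 2
    print(p, Ly == (p + 3) / q, Lz == Fraction(-1) / q)
```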
PROBLEMS

Section 15.1

15.1 Find the characteristic polynomials and complementary functions of the
following differential equations and, where initial conditions are given, find
the appropriate particular solution:

(a) $y'' + 5y' - 14y = 0$;

(b) γ΄ --γΞεῦ; γΞ 1, γ =Oatx =0;

(c) νγ΄ ἀν + 3y = 0;

(d) γ΄  5γ΄ + 2y’ - ὃν = 0;

(e) $y''' + 7y'' + 12y' = 0$; $y = 0$, $y' = 9$, $y'' = -39$ at $x = 0$;

(f) γ Ἐν τον yt

Bye Ξ 3 ay Pay =

15.2 By using the definition of linear dependence state whether the following sets
of functions are linearly dependent:

(ay xP, xx: (b) $\cos x$, $-3 \cos x$, $9 \cos x$;
(c) $\cosh 2x$, $\sinh 2x$, $1$; (d) $\cosh^2 3x$, $\sinh^2 3x$, $2$;

15.3 Obtain the general solution of $y''' - 6y'' + 11y' - 6y = 0$, and by finding
the Wronskian of its three constituent functions prove that they are linearly
independent.

15.4 By forming the general solution and eliminating the arbitrary constants by
differentiation, determine the differential equations that have the following
sets of functions as linearly independent solutions:

(a) οὔ, ς 5; (Ὁ) e%, xe, x2e7; (c) 1, x, 657.

15.5 Find the general solutions of the following differential equations:

(a) y-y ty -y=0;

(b) $y'' + y' + y = 0$;

(c) $y''' - 3ay'' + 3a^2 y' - a^3 y = 0$;

(d) $y^{iv} + 2y'' + y = 0$;

(e) yi¥ + 2y” + Oy = 0.

Section 15.2

15.6 Determine the general solutions of the following differential equations using
both the method of undetermined coefficients and the operator D method
with Rules 1 to 4:

(a) γ΄ + 2y —3y=x?4+x41; 

(6}}} ΕΞ 55}; ΘΙ deta ee 

(c) γ΄ + 2γ' + y = e*; 

(4) y” + 4y + Sy = 6e7(2 cos 2x + sin x); 

(e) γ΄ — y = 2e*; 

(Ὁ γ΄ + 4y = cos 2x; 

(g) γ΄ + 9y = sinh x; 

(h) γ΄ + 2y’ + Sp = ς( + 2%); 

(i) ν᾽" + 3y” + 2y = cos 3x; 

( γ΄ —y — by ΞΞ εἴ + sin x; 

(k) γ΄ + 4° = x - εἴ, 
Section 15.3


15-7 Obtain the general solutions of the following differential equations by using 
the method of variation of parameters: 
(a) γ΄ — y = χοῦ; 
(0) γ΄ —2y + y=xsinx; 
(c) ν΄ — 2y’ + 2y = 4e* sin x; 
(ἃ) γ΄ ay PS χο-. 


Section 15:4 
15-8 Obtain the general solutions of the following simultaneous differential 


equations: 

ω * = y, © = x: 

b) Faxty, % = 2x — y; 

() Faxtyts SY = 2p — 4x — 3». 


15.9 If $dx/dt = y + z$, $dy/dt = x + z$, $dz/dt = x + y$, obtain the general
solution and hence find the particular solution satisfying the initial conditions
$x = y = 1$, $z = 0$ at $t = 0$.


Section 15:5 
15:10 Obtain the series solutions of the following differential equations: 
(a) ν΄ + ἄν = x? with γ = —4, yh =2atx=0; 
(0) ν΄ + y = 3x? + 6x + I with y = 2 at x = 0; 
(c) xy” + yy + 2xy = Owithy = 1, y =Oatx =0; 
(4) + y= x* with y = Oatx = 0. 


Section 15.6

15.11 Use the Runge-Kutta method with $\Delta x = 0.1$ and, working to four decimal
places, determine $y(1)$ given that $y' = (x^2 + y)/x$ with $y(0.5) = 0.5$. Compare
your results with the exact solution $y = \tfrac{1}{2}x + x^2$.

15.12 Use the Runge-Kutta method with $\Delta x = 0.2$ and, working to four decimal
places, determine $y(1)$ given that $y' = y + e^{-x}$ and $y(0) = 0$. Compare your
results with the exact solution $y = \sinh x$.

15.13 Use the Runge-Kutta method with $\Delta x = 0.1$ and, working to four decimal
places, determine $y(0.3)$ given that $y'' - 3y' + 2y = 0$ with $y(0) = 1$ and
$y'(0) = 0$. Compare your results with the exact solution $y = 2e^x - e^{2x}$.


Section 15.7

15.14 The equation of motion of a forced oscillation is

$$\ddot{y} + 2\dot{y} + 5y = 10 \sin \omega t.$$

Find the complete solution, indicating the difference between the transient
and steady state terms. Find also the maximum value of the amplitude of
the steady state oscillation that may be obtained by varying $\omega$.
15.15 Sketch the variation of the phase angle $\delta$ of the particular integral occurring
in Eqn (15.51) as a function of the normalized excitation frequency $\omega/\Omega$,
for the cases $\zeta = 4$, $\zeta = 1$, and $\zeta = 2$.


15.16 Derive an expression for $x$ in the case of a critically damped oscillator for
which

$$\ddot{x} + 2n\dot{x} + n^2 x = 0,$$

where $\dot{x} = u$ and $x = s$ at time $t = 0$. Show that if this equation describes
the motion of a particle, then it will come to rest when $x = u/(ne)$ if $s = 0$.


15.17 When $\Omega^2 > \zeta^2$, the general solution of the damped harmonic motion
described by

$$\ddot{x} + 2\zeta\dot{x} + \Omega^2 x = 0$$

is

$$x = e^{-\zeta t}(A \cos \omega_0 t + B \sin \omega_0 t), \tag{A}$$

where $\omega_0^2 = \Omega^2 - \zeta^2$. Deduce that the extrema of $x$ occur when

$$\tan \omega_0 t = (B\omega_0 - \zeta A)/(A\omega_0 + \zeta B). \tag{B}$$

Denote the positive solutions of this equation by

$$\omega_0 t = \delta_0 + r\pi,$$

where $r = 0, 1, 2, \ldots$, and $\delta_0$ is the smallest positive angle satisfying (B).
Thus, defining the sequence of times $\{t_r\}$ by

$$t_r = (\delta_0 + r\pi)/\omega_0, \qquad r = 0, 1, 2, \ldots,$$

and the corresponding sequence of displacements $\{x_r\}$ by setting $t = t_r$ in
(A), prove that

$$|x_{r+1}/x_r| = \exp(-\zeta\pi/\omega_0).$$

This establishes that the amplitude of successive oscillations
decreases by the constant factor $\exp(-\zeta\pi/\omega_0)$. The constant $\zeta\pi/\omega_0$ is called
the logarithmic decrement of the oscillations.


Section 15.8

15.18 Repeat the solution of the vibration problem in Section 15.8 without the
use of matrices starting from the assumption that
$x = \alpha \sin(\omega t + \varepsilon_1)$ and $y = \beta \sin(\omega t + \varepsilon_2)$.

15.19 A thin light elastic string is stretched between two fixed points A and B and
unit masses are attached to it at points P and Q, where AP = PQ = QB.
The equations of motion determining small lateral displacements $x$ and $y$ of
the masses at points P and Q are

$$\ddot{x} + 2x - y = 0 \quad \text{and} \quad \ddot{y} + 2y - x = 0.$$

Determine the subsequent motion of the system if it is initially released from
rest at time $t = 0$ with $x = a$, $y = 3a$.

15.20 Repeat Problem 15.19 subject to the initial conditions that the system is
released from rest at time $t = 0$ with $x = a$, $y = -a$.


15.21 In a certain vibration problem, displacements $x$, $y$, and $z$ are described by
the system of equations:

$$m\frac{d^2x}{dt^2} + (a + b)x - by = 0,$$

$$m\frac{d^2y}{dt^2} + (a + 2b)y - bx - bz = 0,$$

$$m\frac{d^2z}{dt^2} + (a + b)z - by = 0,$$

in which $a$, $b$, and $m$ are constants. Express these differential equations in
matrix form, and by writing $X(t) = B \sin(\omega t + \varepsilon)$, with $B$ an arbitrary
three element column vector, show that the system has three natural
frequencies $\omega_1^2$, $\omega_2^2$, and $\omega_3^2$ and find their values. Use your results to deduce
the form of the three normal modes.


15.22 Verify the table of Laplace transform pairs given in the text.

15.23 Verify the Laplace transforms of $d^2y/dx^2$ and $d^3y/dx^3$ given in the text.

15.24 Use the Laplace transform to obtain the solutions of the following
differential equations subject to the stated initial conditions:


(a) γ +2y=x if yO) = ἃ; 
(Ὁ) γ΄ —y=sinx if y(0)=0; 
(c) γ΄ +4y=sinhx if γί) ΞΞ 0, γ(0) = 1; 
(4) γ΄ Ἡ 2y4+5y=e7 if yO) =0, yO) = 0; 
(6) γ΄ —y=xe* if yO) = 0, γ(0) = 3; 
(ἢ y+ 3y" + 3νγ +y=x if yO) = 0, yO) = 1, y’O) =0. 
15-25 Use the Laplace transform combined with matrix methods to solve the 
following simultaneous differential equations: 
dz 


τς - -Ὁ if WO =1, 20 -- 2; 


dy _ 
(a) FH 22 


a ee ee 7 Ν = 
ye = 3y JF a z if γί) = 0, 2(0) = 1. 


Fourier series 


16.1 Introductory ideas


Let us begin by taking a first look at Fourier series intuitively, postponing
until later in the chapter any serious attempt to justify the mathematical
operations we shall need to perform. The fundamental idea underlying
Fourier series is that all functions $f(x)$ of practical importance which are
defined on the interval $-\pi < x < \pi$ can be expressed in terms of a convergent
trigonometric series of the form

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \tag{16.1}$$

in which the constant coefficients $a_n$, $b_n$ are related to $f(x)$ in a special way.
The apparent restriction of $f(x)$ to the interval $[-\pi, \pi]$ is unimportant,
since an elementary change of variable will always reduce an arbitrary
interval $[a, b]$ to $[-\pi, \pi]$.

Notice here that because of the periodicity properties of the sine and
cosine functions, the right-hand side of (16.1) must of necessity be periodic
with period $2\pi$. This implies that the best we can expect of such a
representation is that, at each point of the interval $[-\pi, \pi]$, the trigonometric series
has for its sum the function $f(x)$. Naturally, although the trigonometric
series will assign functional values to $f(x)$ for all real $x$, it does not follow
that these need agree with the actual functional values of $f(x)$ outside the
fundamental interval $[-\pi, \pi]$. In fact, the series will provide a periodic
extension of the functional behaviour of $f(x)$ over the fundamental interval
$[-\pi, \pi]$ to every interval of the form $[(2r - 1)\pi, (2r + 1)\pi]$, in the sense that
$f(x) = f(x + 2r\pi)$ for $r = 0, \pm 1, \pm 2, \ldots$.

To deduce the relationship between $f(x)$ and the coefficients $a_n$, $b_n$ let us
first reinterpret results (8.34) to (8.36) of Chapter 8 in terms of definite
integrals taken over the interval $[-\pi, \pi]$. We find at once that for any
integers $m, n = 0, 1, \ldots$,

$$\int_{-\pi}^{\pi} \sin mx \cos nx\,dx = 0 \quad \text{for all } m, n, \tag{16.2}$$

$$\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \begin{cases} 0 & \text{for } m \neq n \\ \pi & \text{for } m = n \neq 0, \end{cases} \tag{16.3}$$

$$\int_{-\pi}^{\pi} \cos mx \cos nx\,dx = \begin{cases} 0 & \text{for } m \neq n \\ \pi & \text{for } m = n \neq 0 \\ 2\pi & \text{for } m = n = 0. \end{cases} \tag{16.4}$$

In mathematical terms, the facts expressed by Eqn (16.2) and by the
first results of Eqns (16.3) and (16.4) are described by saying that the functions
belonging to the system

$$1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos mx, \sin nx, \ldots, \tag{16.5}$$

are orthogonal over the interval $[-\pi, \pi]$. In words, these equations say that
the product of any two different functions of this sequence when integrated
over the interval $[-\pi, \pi]$ will yield zero.
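
The orthogonality relations can be illustrated numerically for a few members of the system. The sketch below integrates products over $[-\pi, \pi]$ by Simpson's rule; the choice of functions and the helper names `integrate` and `inner` are illustrative only.

```python
import math

def integrate(f, n=2000):
    """Simpson's rule on [-pi, pi] with n (even) subintervals."""
    h = 2 * math.pi / n
    total = f(-math.pi) + f(math.pi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-math.pi + i * h)
    return total * h / 3

def inner(u, v):
    """Integral of u(x)*v(x) over [-pi, pi]."""
    return integrate(lambda x: u(x) * v(x))

one = lambda x: 1.0
cos2 = lambda x: math.cos(2 * x)
sin2 = lambda x: math.sin(2 * x)
sin3 = lambda x: math.sin(3 * x)

# Products of different members integrate to approximately zero.
print(inner(sin2, cos2), inner(sin2, sin3), inner(cos2, one))
# Products of a member with itself give approximately pi, pi and 2*pi.
print(inner(sin2, sin2), inner(cos2, cos2), inner(one, one))
```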

The significance of the orthogonality property of system (16.5) is seen
when Eqn (16.1) is multiplied by $\cos mx$ and the result is then integrated
over the interval $[-\pi, \pi]$. We find that

$$\int_{-\pi}^{\pi} f(x) \cos mx\,dx = \frac{a_0}{2} \int_{-\pi}^{\pi} \cos mx\,dx + \sum_{n=1}^{\infty} \left( a_n \int_{-\pi}^{\pi} \cos mx \cos nx\,dx + b_n \int_{-\pi}^{\pi} \cos mx \sin nx\,dx \right),$$

which on account of the above results immediately reduces to

$$\int_{-\pi}^{\pi} f(x) \cos mx\,dx = \pi a_m \quad \text{for } m = 0, 1, \ldots. \tag{16.6}$$

Had Eqn (16.1) been multiplied by $\sin mx$ and the result been integrated
over the interval $[-\pi, \pi]$, an exactly similar argument would have yielded
the result

$$\int_{-\pi}^{\pi} f(x) \sin mx\,dx = \pi b_m \quad \text{for } m = 1, 2, \ldots. \tag{16.7}$$

Thus we have found that for $f(x)$ to have the trigonometric series
representation (16.1), we must define the constant coefficients $a_n$, $b_n$ by the
relationships

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\,dx \tag{16.8}$$

for $n = 0, 1, \ldots$, and

$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\,dx \tag{16.9}$$

for $n = 1, 2, \ldots$.

The coefficients $a_n$, $b_n$ so defined are called the Fourier coefficients of
$f(x)$, and the corresponding right-hand side of (16.1) is then called the
Fourier series of $f(x)$. In our simple derivation of the form of the Fourier
series of $f(x)$ we have presupposed that $f(x)$ is integrable, that termwise
integration of (16.1) after multiplication by $\sin mx$ or $\cos mx$ is permissible
and that the sum of the Fourier series for $-\pi < x < \pi$ is the function $f(x)$
itself. These are major assumptions, and perhaps the best way to indicate
that they need questioning is by considering some typical examples of Fourier
series. However, before proceeding with this plan let us first make some
general comparisons between Taylor series and Fourier series.
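
To make the definitions (16.8) and (16.9) concrete, the following sketch computes the first few Fourier coefficients of the sample function $f(x) = x$ by numerical integration. The choice of $f$, the Simpson's rule helper, and the comparison value $b_n = 2(-1)^{n+1}/n$ (obtained by integrating by parts) are illustrative additions rather than results quoted from the text.

```python
import math

def integrate(f, n=20000):
    """Simpson's rule on [-pi, pi] with n (even) subintervals."""
    h = 2 * math.pi / n
    total = f(-math.pi) + f(math.pi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-math.pi + i * h)
    return total * h / 3

def fourier_coefficients(f, n_max):
    """Return lists a[0..n_max] and b[0..n_max] (b[0] is unused and set to 0)."""
    a = [integrate(lambda x, n=n: f(x) * math.cos(n * x)) / math.pi
         for n in range(n_max + 1)]
    b = [0.0] + [integrate(lambda x, n=n: f(x) * math.sin(n * x)) / math.pi
                 for n in range(1, n_max + 1)]
    return a, b

a, b = fourier_coefficients(lambda x: x, 5)
print([round(v, 6) for v in a])
print([round(v, 6) for v in b])
print([2 * (-1) ** (n + 1) / n for n in range(1, 6)])
```

Since $f(x) = x$ is odd, every $a_n$ vanishes and only sine terms remain.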


The idea that a function may be represented by its Taylor series has
already been discussed at some length, and it was seen in Chapter 12 that
for a function $f(x)$ to be so expressed it needed to be infinitely differentiable.
This is a severe restriction on a function and is one which most functions do
not satisfy. Even when Taylor's theorem with a remainder is employed the
function still needs to be differentiable a finite number of times and this, like
infinite differentiability, certainly implies that the function must be continuous.
Nevertheless, many functions used to describe important physical phenomena
are discontinuous and so cannot be represented by a Taylor series. For
example, the function used to describe the voltage behaviour with time in a
circuit in which a switch is suddenly operated is discontinuous, as is the
functional behaviour of the gas pressure across a shock front (Fig. 3.7).


In principle, at least, Fourier series would appear to offer the possibility 
of representation of discontinuous as well as continuous functions, because 
whereas for a Taylor series expansion a function needs to be differentiable, 
for a Fourier series expansion it would appear that it only needs to be 
integrable. This assertion follows because in Eqn (7-19) we have already 
seen that the integration of piecewise continuous functions presents no 
difficulty, and so the Fourier coefficients $a_n$, $b_n$ can even be computed when
f(x) is piecewise continuous. Naturally, we must examine the functional 
value which a Fourier series attributes to a point of discontinuity of the 
function which it represents, since at such points it is reasonable to expect 
the behaviour of the series to differ from that of the function itself. 


Another important feature of a Fourier series is that it offers a method
of synthesis of a function in terms of simple harmonic components having
periodicities which are sub-multiples of $2\pi$. This is particularly valuable when
an oscillatory problem is being studied since, in effect, it describes the
function involved in terms of the simple harmonic oscillatory modes which
occur naturally in the problem. At this point it is appropriate to comment
on the use of the term orthogonal in connection with the system of functions
(16-5). This term is used deliberately on account of the similarity that exists
between the resolution of an ordinary vector into three orthogonal
components and the decomposition of a function into an infinite number of
Fourier components which are orthogonal in the sense appropriate to
system (16-5). There are important similarities and dissimilarities between a
vector space of three dimensions and one with an infinite number of
dimensions, but these will not be pursued here.


For our first encounter with Fourier series we now choose to consider 
the following three examples. 


Example 16-1 Determine the Fourier series expansion of the function
$f(x) = \pi^2 - x^2$ for $-\pi \le x \le \pi$.


Solution As $f(x) = \pi^2 - x^2$, we see from Eqn (16-8) that the Fourier
coefficients $a_n$ are determined by the integral

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} (\pi^2 - x^2)\cos nx\,dx,$$

where n = 0, 1, .... When n = 0 this yields

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} (\pi^2 - x^2)\,dx = \frac{4}{3}\pi^2.$$

For the case $n \neq 0$ we have

$$a_n = \pi\int_{-\pi}^{\pi}\cos nx\,dx - \frac{1}{\pi}\int_{-\pi}^{\pi} x^2\cos nx\,dx
= -\frac{1}{\pi}\left[\frac{2x}{n^2}\cos nx + \frac{(n^2x^2 - 2)}{n^3}\sin nx\right]_{-\pi}^{\pi}
= (-1)^{n+1}\frac{4}{n^2}.$$

To determine the Fourier coefficients $b_n$ we must use Eqn (16-9) which
shows that

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} (\pi^2 - x^2)\sin nx\,dx,$$

where n = 1, 2, .... Instead of evaluating this integral directly, let us
divide the interval of integration and rewrite the result as the sum of two
integrals. First we write

$$b_n = \frac{1}{\pi}\int_{-\pi}^{0} (\pi^2 - x^2)\sin nx\,dx + \frac{1}{\pi}\int_{0}^{\pi} (\pi^2 - x^2)\sin nx\,dx,$$

and then, setting x = -z in the first integral, this becomes

$$b_n = -\frac{1}{\pi}\int_{\pi}^{0} (\pi^2 - z^2)\sin(-nz)\,dz + \frac{1}{\pi}\int_{0}^{\pi} (\pi^2 - x^2)\sin nx\,dx.$$

However, sin (-nz) = -sin nz, and the minus sign in front of the first
integral may be utilized to reverse the order of the limits of integration, so
that finally we arrive at

$$b_n = -\frac{1}{\pi}\int_{0}^{\pi} (\pi^2 - z^2)\sin nz\,dz + \frac{1}{\pi}\int_{0}^{\pi} (\pi^2 - x^2)\sin nx\,dx,$$

for n = 1, 2, .... As the variable in a definite integral is only a dummy
variable, we may replace z by x in the first integral to deduce that

$$b_n = 0 \quad \text{for all } n.$$


This result could, of course, have been obtained by direct evaluation of the
definite integral for $b_n$ using integration by parts, though the argument would
have been more tedious.

Inserting the Fourier coefficients $a_n$, $b_n$ into Eqn (16-1) then gives the
Fourier series of $f(x) = \pi^2 - x^2$ for $-\pi \le x \le \pi$. The result obtained is

$$f(x) = \frac{2}{3}\pi^2 + 4\left(\cos x - \frac{1}{2^2}\cos 2x + \frac{1}{3^2}\cos 3x - \cdots + \frac{(-1)^{n+1}}{n^2}\cos nx + \cdots\right).$$


The relationship between the Fourier series representation of $f(x) = \pi^2 - x^2$
in the fundamental interval $[-\pi, \pi]$, the periodic extension it
assigns to f(x) outside the fundamental interval, and the actual behaviour
of f(x) both inside and outside the fundamental interval are illustrated in
Fig. 16-1 (a). The full curve denotes both the functional behaviour of f(x)
and that of its Fourier series in the fundamental interval, the dotted curve
denotes the periodic extension of f(x) and the chain-dotted curve denotes the
actual behaviour of f(x) outside the fundamental interval. Of course, it still
remains for us to justify our assertion that the Fourier series converges to f(x)
in $[-\pi, \pi]$, but it certainly does so when x = 0 and $x = \pm\pi$. This follows
by employing the standard results

$$\frac{\pi^2}{12} = \frac{1}{1^2} - \frac{1}{2^2} + \frac{1}{3^2} - \frac{1}{4^2} + \cdots$$

and

$$\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$

to evaluate the Fourier series at those points.
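
Written out, the two evaluations run as follows. This is a worked check added for convenience; it uses only the series of Example 16-1 with $a_0/2 = \tfrac{2}{3}\pi^2$, $a_n = 4(-1)^{n+1}/n^2$, the values $\cos n\pi = (-1)^n$, and the two standard sums for $\pi^2/12$ and $\pi^2/6$.

```latex
\begin{aligned}
x = 0:&\quad \tfrac{2}{3}\pi^2 + 4\left(1 - \tfrac{1}{2^2} + \tfrac{1}{3^2} - \cdots\right)
      = \tfrac{2}{3}\pi^2 + 4\cdot\tfrac{\pi^2}{12} = \pi^2 = f(0),\\[4pt]
x = \pm\pi:&\quad \tfrac{2}{3}\pi^2 - 4\left(1 + \tfrac{1}{2^2} + \tfrac{1}{3^2} + \cdots\right)
      = \tfrac{2}{3}\pi^2 - 4\cdot\tfrac{\pi^2}{6} = 0 = f(\pm\pi).
\end{aligned}
```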




Fig. 16-1 (a) Fourier series representation of $f(x) = \pi^2 - x^2$.


Let us define the mth partial sum $S_m(x)$ of this Fourier series to be

$$S_m(x) = \frac{2}{3}\pi^2 + 4\sum_{n=1}^{m-1}\frac{(-1)^{n+1}}{n^2}\cos nx.$$

Then, when working numerically with the Fourier series, the function will
need to be approximated by a partial sum. The behaviour of the second,
third, and fourth partial sums

$$S_2(x) = \tfrac{2}{3}\pi^2 + 4\cos x,$$

$$S_3(x) = \tfrac{2}{3}\pi^2 + 4\cos x - \cos 2x,$$

$$S_4(x) = \tfrac{2}{3}\pi^2 + 4\cos x - \cos 2x + \tfrac{4}{9}\cos 3x,$$

is shown in Fig. 16-1 (b) and they certainly suggest the convergence of $S_m(x)$
to f(x) over $[-\pi, \pi]$.
Fig. 16-1 (b) Approximation of $\pi^2 - x^2$ by partial sums.
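
The behaviour of the partial sums can also be examined numerically. The sketch below is an illustration only: the function names `partial_sum` and `max_error` are assumptions, and the error is sampled on a finite grid, so it only suggests (and does not prove) convergence over the whole interval.

```python
import math

def partial_sum(x, m):
    """S_m(x) for f(x) = pi^2 - x^2: the constant term plus the first m - 1 cosine terms."""
    total = 2 * math.pi ** 2 / 3
    for n in range(1, m):
        total += 4 * (-1) ** (n + 1) / n ** 2 * math.cos(n * x)
    return total

def max_error(m, samples=401):
    """Largest |f(x) - S_m(x)| on an evenly spaced grid over [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * k / (samples - 1) for k in range(samples)]
    return max(abs(math.pi ** 2 - x ** 2 - partial_sum(x, m)) for x in xs)

for m in (2, 3, 4, 20):
    print(m, round(max_error(m), 4))
```

The sampled error decreases as m grows, with the largest discrepancy occurring at the end points of the interval, where every neglected term has the same sign.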


Example 16-2 Determine the Fourier series expansion of the function
f(x) = |x| for $-\pi \le x \le \pi$.


Solution As before, we have from Eqn (16-8), that

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} |x|\cos nx\,dx,$$

where n = 0, 1, .... When n = 0 this yields

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{0} (-x)\,dx + \frac{1}{\pi}\int_{0}^{\pi} x\,dx = \frac{2}{\pi}\int_{0}^{\pi} x\,dx = \pi.$$

When $n \neq 0$ we have

$$a_n = \frac{1}{\pi}\int_{-\pi}^{0} (-x)\cos nx\,dx + \frac{1}{\pi}\int_{0}^{\pi} x\cos nx\,dx
= \frac{2}{\pi}\int_{0}^{\pi} x\cos nx\,dx
= \frac{2}{\pi}\left[\frac{x}{n}\sin nx + \frac{1}{n^2}\cos nx\right]_{0}^{\pi}
= \begin{cases} 0 & \text{for } n = 2m, \\[4pt] \dfrac{-4}{\pi(2m+1)^2} & \text{for } n = 2m + 1. \end{cases}$$


A moment's reflection shows that the form of argument used to establish
that $b_n = 0$ for all n in the previous example succeeded because the function
f(x) involved was an even function, while sin nx is an odd function. Here, as
f(x) = |x|, we are again dealing with an even function so that once again we
may conclude that

$$b_n = 0 \quad \text{for all } n.$$

Hence the Fourier series of f(x) = |x| in $-\pi \le x \le \pi$ has the form

$$f(x) = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos x + \frac{1}{3^2}\cos 3x + \frac{1}{5^2}\cos 5x + \cdots + \frac{1}{(2n+1)^2}\cos(2n+1)x + \cdots\right).$$

Again, this certainly converges to f(x) when x = 0 and $x = \pm\pi$, as may
be seen by assigning to x the appropriate values and employing the standard
result

$$\frac{\pi^2}{8} = \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots$$

to evaluate the Fourier series at those points.


In Fig. 16-2 (a) the full line denotes the behaviour of the function f(x) = |x|
and its Fourier series in the fundamental interval $[-\pi, \pi]$, the dotted
line denotes the periodic extension of f(x) outside $[-\pi, \pi]$, and the
chain-dotted line denotes the actual behaviour of f(x) outside $[-\pi, \pi]$. The
behaviour of the third partial sum

$$S_3(x) = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos x + \frac{1}{3^2}\cos 3x\right)$$

is shown in Fig. 16-2 (b). Again, it appears reasonable to suppose that in the
limit of large n, $S_n(x)$ will converge to f(x) for all $x \in [-\pi, \pi]$.

Fig. 16-2 (a) Fourier series representation of f(x) = |x| for $-\pi \le x \le \pi$.

Fig. 16-2 (b) Approximation of |x| by the partial sum $S_3(x)$.
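
A numerical look at the series for |x| supports this expectation. The following is a hedged sketch: `abs_series` is an illustrative name, and `terms` counts the cosine terms retained after the constant term.

```python
import math

def abs_series(x, terms):
    """Constant term plus the first `terms` cosine terms of the Fourier series for |x|."""
    total = math.pi / 2
    for k in range(terms):
        n = 2 * k + 1  # only odd harmonics appear
        total -= 4 / math.pi * math.cos(n * x) / n ** 2
    return total

for x in (0.0, 1.0, math.pi):
    print(x, abs(x), abs_series(x, 2), abs_series(x, 200))
```

With two cosine terms the sum is the partial sum $S_3(x)$ quoted above; with 200 terms the values agree with |x| to about two decimal places at the sampled points, the corner at x = 0 being the slowest to settle.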


Example 16-3 Determine the Fourier series expansion of the function

$$f(x) = \begin{cases} a & \text{for } -\pi < x < 0, \\ b & \text{for } 0 \le x \le \pi. \end{cases}$$

Solution Proceeding as before and using the notation of Eqn (7-19),

$$a_n = \frac{1}{\pi}\int_{-\pi}^{0-} a\cos nx\,dx + \frac{1}{\pi}\int_{0+}^{\pi} b\cos nx\,dx$$

for n = 0, 1, 2, ..., and

$$b_n = \frac{1}{\pi}\int_{-\pi}^{0-} a\sin nx\,dx + \frac{1}{\pi}\int_{0+}^{\pi} b\sin nx\,dx$$

for n = 1, 2, ....


A simple calculation shows that

$$a_0 = a + b, \quad a_n = 0 \quad \text{for } n = 1, 2, \ldots,$$

and

$$b_{2n} = 0, \quad b_{2n+1} = \frac{2(b - a)}{\pi(2n + 1)}.$$

Substitution of these Fourier coefficients into Eqn (16-1) then shows that
the Fourier series of f(x) in $[-\pi, \pi]$ is

$$f(x) = \left(\frac{a + b}{2}\right) + \frac{2}{\pi}(b - a)\left(\sin x + \frac{1}{3}\sin 3x + \frac{1}{5}\sin 5x + \cdots + \frac{1}{2n+1}\sin(2n+1)x + \cdots\right).$$

This Fourier series certainly converges to the function f(x) when
$x = \pm\tfrac{1}{2}\pi$, as can be seen by assigning these values to x and employing the final
result of Example 12-12 to sum the series. Observe though that in this case
the Fourier series assumes the value (a + b)/2 for x = 0 and $x = \pm\pi$
which is not in agreement with the actual functional values at those points.
It is, in fact, the average of the functional values to the immediate left and
right of the discontinuity at x = 0. This result is not coincidental, and later
we show it to be true of all Fourier series at jump discontinuities. The
approximation of f(x) by the third partial sum

$$S_3(x) = \left(\frac{a + b}{2}\right) + \frac{2}{\pi}(b - a)\left(\sin x + \frac{1}{3}\sin 3x\right)$$

in the case a = 0, b = 1 is shown in Fig. 16-3 in which a circle denotes an
end point not included and a dot an end point that is included.

Fig. 16-3 Approximation of f(x) = 0 for $-\pi < x < 0$ and f(x) = 1 for $0 \le x \le \pi$
by the partial sum $S_3(x)$.
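
The averaging behaviour at the jump is easy to observe numerically. This is a small sketch under stated assumptions: `step_series` is an illustrative name, and the case a = 0, b = 1 of Fig. 16-3 is used.

```python
import math

def step_series(x, a, b, terms):
    """Constant term plus the first `terms` sine terms of the series of Example 16-3."""
    total = (a + b) / 2
    for k in range(terms):
        n = 2 * k + 1  # only odd harmonics have nonzero coefficients
        total += 2 * (b - a) / math.pi * math.sin(n * x) / n
    return total

a, b = 0.0, 1.0
print(step_series(0.0, a, b, 50))           # every sine term vanishes at x = 0
print(step_series(math.pi / 2, a, b, 50))   # tends towards b
print(step_series(-math.pi / 2, a, b, 50))  # tends towards a
```

At x = 0 each sine term is zero, so every partial sum equals (a + b)/2 exactly, whatever the number of terms retained.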


In concluding this section let us emphasize how general the class of
functions may be for which Fourier series can be found. Observe that in
Example 16-1 the function was continuous and it had a continuous derivative
throughout $[-\pi, \pi]$; in Example 16-2 the function was continuous
everywhere throughout $[-\pi, \pi]$ but its derivative was not defined at the origin;
while in Example 16-3 the function was discontinuous at the origin.


16-2 Convergence of Fourier series 


The three examples studied in the previous section suffice to indicate not only
that Fourier series can be associated with widely differing types of function,
but also that the convergence properties of the resulting series require careful
attention. In fact, in the examples considered, the series could only be
summed at special points in $[-\pi, \pi]$ by utilizing the known series for
$\tfrac{1}{4}\pi$ and for $\tfrac{1}{6}\pi^2$, $\tfrac{1}{8}\pi^2$ and $\tfrac{1}{12}\pi^2$ and, although the behaviour of the partial sums was suggestive,
so far nothing rigorous can be inferred about their sums at any other points
in $[-\pi, \pi]$.

It is now time for us to regularize our approach to Fourier series, and
this task we undertake in the present section. A reader who is concerned only
with the use of Fourier series, and is prepared to forgo this discussion of
convergence, may proceed directly to the next section after first considering
Theorem 16-1 which provides the justification for what is to follow. We
shall start by giving a formal definition of the Fourier series of an integrable
function f(x) defined on $[-\pi, \pi]$ and, under simple assumptions regarding
f(x), proceed to examine the nature of its convergence. Usually, until the
convergence problem has been resolved, it is customary to denote the
relationship between f(x) and its Fourier series by the sign ~ instead of an
equality.
DEFINITION 16-1 (Fourier Series)


The Fourier series of an integrable function f(x) defined on the interval
$[-\pi, \pi]$ is the series

$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n\cos nx + b_n\sin nx),$$

in which the Fourier coefficients $a_n$, $b_n$ are given by

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \quad \text{for } n = 0, 1, \ldots,$$

and

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx \quad \text{for } n = 1, 2, \ldots.$$


The main result of this section will be the establishing of a fundamental 
theorem on the convergence of the Fourier series of a function f(x). However, 
as this will require several subsidiary results which are important in their 
own right, we now establish them in the form of two lemmas. 

First, using the summation formula for a geometric progression (cf.
Problem 1-52 (a)), it follows immediately that for x not a multiple of $2\pi$,

$$\sum_{r=1}^{n} e^{irx} = \frac{\exp(inx) - 1}{1 - \exp(-ix)}
= \frac{\exp[i(n + \tfrac{1}{2})x] - \exp(\tfrac{1}{2}ix)}{\exp(\tfrac{1}{2}ix) - \exp(-\tfrac{1}{2}ix)}
= \frac{\exp[i(n + \tfrac{1}{2})x] - \exp(\tfrac{1}{2}ix)}{2i\sin\tfrac{1}{2}x}.$$

Hence, equating the real parts of this equation, we deduce that

$$\frac{1}{2} + \sum_{r=1}^{n}\cos rx = \frac{\sin(n + \tfrac{1}{2})x}{2\sin\tfrac{1}{2}x}. \tag{16-10}$$

Integration of this expression over the intervals $[-\pi, 0]$ and $[0, \pi]$ shows at
once that

$$\int_{-\pi}^{0}\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du = \int_{0}^{\pi}\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du = \tfrac{1}{2}\pi, \tag{16-11}$$

since the only contribution from the left-hand side of Eqn (16-10) arises
from the constant term. Here, for convenience later, the dummy variable of
integration has been denoted by u.
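
Identity (16-10) lends itself to a quick numerical spot check. The sketch below is illustrative only (the names `lhs` and `rhs` are assumptions) and simply compares the two sides at a few sample points that are not multiples of $2\pi$.

```python
import math

def lhs(x, n):
    """Left-hand side of (16-10): 1/2 + cos x + cos 2x + ... + cos nx."""
    return 0.5 + sum(math.cos(r * x) for r in range(1, n + 1))

def rhs(x, n):
    """Right-hand side of (16-10); x must not be a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / (2 * math.sin(0.5 * x))

for x in (0.3, 1.0, 2.5, -2.0):
    for n in (1, 5, 40):
        assert abs(lhs(x, n) - rhs(x, n)) < 1e-9
print("identity (16-10) agrees at the sampled points")
```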

Now define the nth partial sum $S_n(x)$ of the Fourier series of f(x) to be

$$S_n(x) = \frac{a_0}{2} + \sum_{r=1}^{n} (a_r\cos rx + b_r\sin rx). \tag{16-12}$$

Then, by virtue of the definition of the Fourier coefficients $a_r$, $b_r$, we may
write

$$S_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt + \frac{1}{\pi}\sum_{r=1}^{n}\left[\cos rx\int_{-\pi}^{\pi} f(t)\cos rt\,dt + \sin rx\int_{-\pi}^{\pi} f(t)\sin rt\,dt\right], \tag{16-13}$$

where the dummy variable t has been employed in the integrals defining the
coefficients $a_r$, $b_r$ to avoid confusion with the variable x occurring in $S_n(x)$.

Taking the functions cos rx, sin rx under the integral signs, which is
permissible because x is not a variable of integration, and employing the
trigonometric identity cos r(x - t) = cos rx cos rt + sin rx sin rt then allows
us to write Eqn (16-13) in the form

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\left[\frac{1}{2} + \sum_{r=1}^{n}\cos r(x - t)\right]dt.$$

Applying identity (16-10), and writing x - t = u, this then becomes

$$S_n(x) = \frac{1}{\pi}\int_{x-\pi}^{x+\pi} f(x - u)\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du.$$

The trigonometric factor in this integrand has period $2\pi$ so that if, for
the purposes of the study of its Fourier series, f(x) itself is also regarded as
periodic with period $2\pi$, then the entire integrand is periodic with period $2\pi$.
Consequently, a definite integral of this function taken over any interval of
length $2\pi$ will be the same, showing that we may replace the limits $x - \pi$ and
$x + \pi$ by $-\pi$ and $\pi$, respectively. This assumption of the periodicity of the
function f(x) outside $[-\pi, \pi]$ in fact places no restriction on f(x), because the
Fourier series can only represent f(x) in the fundamental interval, so that
how f(x) is defined outside it is immaterial. Finally, since the trigonometric
factor is an even function of u, replacing u by -u shows that f(x - u) may
equally well be written f(x + u). Hence we have established the
following useful lemma.


LEMMA 16-1 (Integral representation of $S_n(x)$) The nth partial sum of the
Fourier series of the function f(x) belonging to the fundamental interval
$[-\pi, \pi]$, and defined by periodic extension outside it, may be represented
in the form

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x + u)\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du.$$
Let us suppose that $f^2(x)$ is an integrable function over $[-\pi, \pi]$, in the
sense that its integral is finite, and consider the obvious identity

$$\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = \int_{-\pi}^{\pi} f^2(x)\,dx - 2\int_{-\pi}^{\pi} f(x)S_n(x)\,dx + \int_{-\pi}^{\pi} S_n^2(x)\,dx. \tag{16-14}$$

Now it follows from the definition of the Fourier coefficients, the
orthogonality property of the trigonometric system (16-5) and the form of $S_n(x)$
in (16-12) that

$$\int_{-\pi}^{\pi} S_n^2(x)\,dx = \int_{-\pi}^{\pi} f(x)S_n(x)\,dx = \pi\left[\frac{a_0^2}{2} + \sum_{r=1}^{n} (a_r^2 + b_r^2)\right]. \tag{16-15}$$

Combining (16-14) and (16-15) then yields

$$\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = \int_{-\pi}^{\pi} f^2(x)\,dx - \pi\left[\frac{a_0^2}{2} + \sum_{r=1}^{n} (a_r^2 + b_r^2)\right]. \tag{16-16}$$

As the integrand of the left-hand integral involves a square, it is either
positive or zero, so that we may conclude

$$\frac{a_0^2}{2} + \sum_{r=1}^{n} (a_r^2 + b_r^2) \le \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx. \tag{16-17}$$

This is known as Bessel's inequality and it is true for all n. The fact that
the right-hand side is finite by hypothesis implies that the sum of the squares
of the Fourier coefficients must always be convergent. This result, coupled
with the argument preceding Theorem 12-2 which established that the nth
term of a convergent series must tend to zero, thus proves that

$$\lim_{n\to\infty} a_n = 0 \quad \text{and} \quad \lim_{n\to\infty} b_n = 0. \tag{16-18}$$


Observe that when it is true that

$$\lim_{n\to\infty} S_n(x) = f(x) \quad \text{for } -\pi < x < \pi, \tag{16-19}$$

then Eqn (16-16) implies that

$$\frac{a_0^2}{2} + \sum_{n=1}^{\infty} (a_n^2 + b_n^2) = \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx. \tag{16-20}$$

This last result is known as Parseval's relation but, as yet, we have still
to deduce sufficient conditions for the limit (16-19) to be true. Parseval's
relation would also be true were limit (16-19) to be replaced by the weaker
condition

$$\lim_{n\to\infty}\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0,$$

though the convergence of $S_n(x)$ to f(x) would then no longer be the type
of convergence we have studied so far. Advanced studies of Fourier series
exploit this more general notion of convergence which is usually known as
convergence in the mean.
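
Bessel's inequality and Parseval's relation can be illustrated with the coefficients already found in Example 16-1. The sketch below is an illustration, not a proof: the names `bessel_sum` and `mean_square` are assumptions, and the closed form $16\pi^4/15$ for $(1/\pi)\int_{-\pi}^{\pi}(\pi^2 - x^2)^2\,dx$ follows from a routine integration.

```python
import math

# Coefficients of f(x) = pi^2 - x^2 from Example 16-1 (all b_r are zero).
a0 = 4 * math.pi ** 2 / 3

def a(r):
    return 4 * (-1) ** (r + 1) / r ** 2

# (1/pi) times the integral of f(x)^2 over [-pi, pi], in closed form.
mean_square = 16 * math.pi ** 4 / 15

def bessel_sum(n):
    """Left-hand side of Bessel's inequality (16-17) with n terms."""
    return a0 ** 2 / 2 + sum(a(r) ** 2 for r in range(1, n + 1))

for n in (1, 2, 5, 50):
    print(n, bessel_sum(n), mean_square)
```

The partial sums increase with n, stay below the right-hand side, and approach it closely, which is what Parseval's relation (16-20) asserts in the limit.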

In terms of the definition of the Fourier coefficients, the limits (16-18) are
seen to be equivalent to

$$\lim_{n\to\infty}\int_{-\pi}^{\pi} f(x)\cos nx\,dx = 0 \quad \text{and} \quad \lim_{n\to\infty}\int_{-\pi}^{\pi} f(x)\sin nx\,dx = 0. \tag{16-21}$$

However, f(x) is an arbitrary function, and if it is identically zero outside
the interval [a, b], where $-\pi \le a < b \le \pi$, then the limits (16-21) are still true
if the integration is only over [a, b]. In addition, on account of the arbitrary
nature of f(x), the first limit will still be true if f(x) is replaced by $f(x)\sin\tfrac{1}{2}x$,
and the second limit will still be true if f(x) is replaced by $f(x)\cos\tfrac{1}{2}x$. Adding
these modified integrals and simplifying the integrand then gives

$$\lim_{n\to\infty}\int_{a}^{b} f(x)\sin(n + \tfrac{1}{2})x\,dx = 0. \tag{16-22}$$

Collecting results together we arrive at the other lemma that will be
required.


LEMMA 16-2 (Properties of Fourier coefficients) Let f(x) be defined arbitrarily
in the interval $[-\pi, \pi]$ over which $f^2(x)$ is integrable and let it be defined by
periodic extension outside it. Then, if $-\pi \le a < b \le \pi$, it follows that

(a) $$\lim_{n\to\infty}\int_{a}^{b} f(x)\cos nx\,dx = 0 \quad \text{and} \quad \lim_{n\to\infty}\int_{a}^{b} f(x)\sin nx\,dx = 0;$$

and

(b) $$\lim_{n\to\infty}\int_{a}^{b} f(x)\sin(n + \tfrac{1}{2})x\,dx = 0.$$

Now, finally, we are in a position to prove our fundamental Fourier
theorem on convergence. Consider a function f(x) defined on $[-\pi, \pi]$ which
has a finite discontinuity at $x = x_0$, and let $f^2(x)$ be integrable over this
interval with the function being defined by periodic extension outside $[-\pi, \pi]$.
Using the notation of Section 3-4, suppose also that $f(x_0-) = L_1$ and
$f(x_0+) = L_2$.

Then, from Lemma 16-1, we may write

$$S_n(x_0) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x_0 + u)\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du; \tag{16-23}$$

but multiplying the first result of Eqn (16-11) by $f(x_0-)/\pi$ and the second by
$f(x_0+)/\pi$, which are constants as far as the integration is concerned, we also
have

$$\tfrac{1}{2}f(x_0-) = \frac{1}{\pi}\int_{-\pi}^{0} f(x_0-)\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du, \tag{16-24}$$

and

$$\tfrac{1}{2}f(x_0+) = \frac{1}{\pi}\int_{0}^{\pi} f(x_0+)\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du. \tag{16-25}$$

Hence, from the above results we deduce that

$$S_n(x_0) - \tfrac{1}{2}[f(x_0-) + f(x_0+)] = \frac{1}{\pi}\int_{-\pi}^{0} [f(x_0 + u) - f(x_0-)]\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du
+ \frac{1}{\pi}\int_{0}^{\pi} [f(x_0 + u) - f(x_0+)]\,\frac{\sin(n + \tfrac{1}{2})u}{2\sin\tfrac{1}{2}u}\,du. \tag{16-26}$$

These integrands on the right-hand side are well defined everywhere
except, possibly, at u = 0 where they require examination. The first integrand
can be written in the form

$$F_1(u)\sin(n + \tfrac{1}{2})u,$$

where

$$F_1(u) = \left(\frac{f(x_0 + u) - f(x_0-)}{u}\right)\left(\frac{\tfrac{1}{2}u}{\sin\tfrac{1}{2}u}\right).$$

Clearly, as $u \to 0$ the second factor tends to unity, and when the left-hand
derivative of f exists at $x = x_0$, the first factor tends to $f'(x_0-)$. So that
$F_1(0) = f'(x_0-)$, and the first integrand is also well defined at u = 0.
Similarly, if

$$F_2(u) = \left(\frac{f(x_0 + u) - f(x_0+)}{u}\right)\left(\frac{\tfrac{1}{2}u}{\sin\tfrac{1}{2}u}\right),$$

and the right-hand derivative of f exists at $x = x_0$, then $F_2(0) = f'(x_0+)$ and
the second integrand is also well defined at u = 0. We may thus write

$$S_n(x_0) - \tfrac{1}{2}[f(x_0-) + f(x_0+)] = \frac{1}{\pi}\int_{-\pi}^{0} F_1(u)\sin(n + \tfrac{1}{2})u\,du
+ \frac{1}{\pi}\int_{0}^{\pi} F_2(u)\sin(n + \tfrac{1}{2})u\,du. \tag{16-27}$$
7 0 
Then, applying Lemma 16-2 (b) to the right-hand side, with $a = -\pi$,
b = 0 for the first integral and with a = 0, $b = \pi$ for the second integral,
we conclude that

$$\lim_{n\to\infty}\{S_n(x_0) - \tfrac{1}{2}[f(x_0-) + f(x_0+)]\} = 0. \tag{16-28}$$

Hence we have proved our main result

$$\lim_{n\to\infty} S_n(x_0) = \tfrac{1}{2}[f(x_0-) + f(x_0+)] = \tfrac{1}{2}[L_1 + L_2]. \tag{16-29}$$

If f(x) is continuous at $x = x_0$, though its derivative need not be
continuous there, then $L_1 = L_2 = f(x_0)$ and it follows directly that

$$\lim_{n\to\infty} S_n(x_0) = f(x_0). \tag{16-30}$$

We have thus proved one form of a Fourier theorem on convergence
of Fourier series which may be stated as follows.


THEOREM 16-1 (A Fourier theorem) Let f(x) be a piecewise continuous
function defined arbitrarily on the interval $[-\pi, \pi]$, and by periodic extension
outside it. Then, if $f^2(x)$ is integrable over this interval, and f(x) has finite
left-hand and right-hand derivatives at its points of jump discontinuity, it
follows that

(a) when $x = x_0$ is a point of continuity of f(x) then

$$\lim_{n\to\infty} S_n(x_0) = f(x_0);$$

(b) when $x = x_0$ is a point of discontinuity of f(x), then

$$\lim_{n\to\infty} S_n(x_0) = \tfrac{1}{2}[f(x_0-) + f(x_0+)].$$
This theorem fully justifies our intuitive approach to Fourier series in 

Section 16-1 and, although it can be proved under more general conditions, 
it will suffice for almost all practical purposes. Although, strictly, when 
dealing with piecewise continuous functions we should always use the sign ~ 
in the Fourier series to allow for the result of Eqn (16-29), we shall instead 
use the equality sign and leave to the reader the task of interpreting its 
meaning at isolated points of discontinuity. One useful consequence of this 
theorem is that it provides a means of deriving a great variety of series 
expansions for mathematical constants. Let us illustrate this in the form of 
an example. 


Example 16-4 (a) Deduce a series expansion for $\pi^2$ using the Fourier
series expansion for $f(x) = \pi^2 - x^2$ in the interval $[-\pi, \pi]$.
(b) Given that the function

$$f(x) = \begin{cases} 0 & \text{for } -\pi < x < 0, \\ x & \text{for } 0 \le x < \pi, \end{cases}$$

has the Fourier series

$$f(x) = \frac{\pi}{4} - \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{\cos(2n - 1)x}{(2n - 1)^2} - \sum_{n=1}^{\infty}(-1)^n\frac{\sin nx}{n},$$

deduce a series expansion for $\pi^2$ and compare its convergence with that of
the series in (a).


Solution (a) The function f(x) is continuous in $[-\pi, \pi]$ so that by Theorem
16-1 its Fourier series converges to f(x) at every point of that interval.
Hence using the Fourier series derived in Example 16-1 with $x = \tfrac{1}{2}\pi$ yields

$$\pi^2 - \frac{\pi^2}{4} = \frac{2}{3}\pi^2 + 4\left(\cos\frac{\pi}{2} - \frac{1}{2^2}\cos\pi + \frac{1}{3^2}\cos\frac{3\pi}{2} - \frac{1}{4^2}\cos 2\pi + \cdots\right),$$

which, after simplification, gives the result

$$\pi^2 = 48\left(\frac{1}{2^2} - \frac{1}{4^2} + \frac{1}{6^2} - \frac{1}{8^2} + \cdots\right).$$

This is an alternating series with nth term $a_n = (-1)^{n+1}(12/n^2)$, so that by
Corollary 12-7 the remainder $R_n$ after n terms is such that
$0 < |R_n| < 12/(n + 1)^2$. Thus, summing to nine terms, the absolute value of the remainder
cannot exceed 0.12. In fact the ninth partial sum yields the approximation
$\pi \approx 3.1520$.


(b) The function f(x) is continuous everywhere in $[-\pi, \pi]$ except at
$x = \pm\pi$ where, by Theorem 16-1 and the periodic extension of f(x) beyond
that interval, the Fourier series converges to $\tfrac{1}{2}[f(\pi-) + f(\pi+)] = \tfrac{1}{2}(\pi + 0)
= \tfrac{1}{2}\pi$. Hence, setting $x = \pi$ and $f(\pi) = \tfrac{1}{2}\pi$ in the given Fourier series, we
obtain

$$\frac{\pi}{2} = \frac{\pi}{4} - \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{\cos(2n - 1)\pi}{(2n - 1)^2} - \sum_{n=1}^{\infty}(-1)^n\frac{\sin n\pi}{n},$$

whence

$$\pi^2 = 8\left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots\right).$$

This is, in fact, the series quoted in Example 16-2, and it is not an alternating
series. The remainder term $R_n$ may be deduced from Corollary 12-4 from
which it is seen to satisfy the inequality $R_n < 8/n$ showing that $R_9 < 8/9$
which, as it happens, is a gross overestimate of the error. In fact the ninth
partial sum yields the approximation $\pi \approx 3.1060$.
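
The two approximations can be reproduced directly. In the sketch below the function names are illustrative assumptions, and the series of part (a) is summed in the equivalent form $\pi^2 = 12(1 - 1/2^2 + 1/3^2 - \cdots)$, which is the same series with the common factor removed.

```python
import math

def pi_from_alternating(terms):
    """Square root of the partial sum of pi^2 = 12 * (1 - 1/2^2 + 1/3^2 - ...)."""
    return math.sqrt(12 * sum((-1) ** (n + 1) / n ** 2 for n in range(1, terms + 1)))

def pi_from_odd_squares(terms):
    """Square root of the partial sum of pi^2 = 8 * (1 + 1/3^2 + 1/5^2 + ...)."""
    return math.sqrt(8 * sum(1 / (2 * n - 1) ** 2 for n in range(1, terms + 1)))

print(pi_from_alternating(9), pi_from_odd_squares(9), math.pi)
```

With nine terms the alternating series lies noticeably closer to $\pi$ than the series of odd squares, the first overshooting and the second, whose terms are all positive, undershooting.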
16-3 Different forms of Fourier series


A number of special forms of Fourier series occur depending on whether
or not the fundamental interval is $[-\pi, \pi]$, or the function f(x) is even or odd.
These are of sufficient importance to merit recording them formally.


(a) Change of interval 


If f(x) is defined arbitrarily in the fundamental interval $[-\pi, \pi]$, and by
periodic extension outside it, then the integrands of the integrals defining the
Fourier coefficients $a_n$, $b_n$ in Definition 16-1 will be periodic with period $2\pi$.
Consider, for example, the integral

$$\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx.$$

Then from the property of the definite integral we may write

$$\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx = \int_{\alpha-\pi}^{-\pi} f(x)\cos nx\,dx + \int_{-\pi}^{\pi} f(x)\cos nx\,dx + \int_{\pi}^{\alpha+\pi} f(x)\cos nx\,dx. \tag{16-31}$$

If, in the first integral on the right-hand side, the variable change $u = x + 2\pi$
is made then, using the fact that reversing the limits of a definite integral
changes its sign, and the periodicity of f implies $f(u - 2\pi) = f(u)$, we see that

$$\int_{\alpha+\pi}^{\pi} f(u - 2\pi)\cos nu\,du = -\int_{\pi}^{\alpha+\pi} f(u)\cos nu\,du.$$

Combining this result with Eqn (16-31), and changing the dummy variable
u to x, gives

$$\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx = -\int_{\pi}^{\alpha+\pi} f(x)\cos nx\,dx + \int_{-\pi}^{\pi} f(x)\cos nx\,dx + \int_{\pi}^{\alpha+\pi} f(x)\cos nx\,dx = \int_{-\pi}^{\pi} f(x)\cos nx\,dx.$$

The implication of this result is that for any $\alpha$, the Fourier coefficient
$a_n$ is given by

$$a_n = \frac{1}{\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx. \tag{16-32}$$

A similar argument establishes an equivalent result for $b_n$ and we have thus
proved the next theorem.


THEOREM 16-2 (Change of origin of fundamental interval) If f(x) is defined
arbitrarily in the fundamental interval $[-\pi, \pi]$ and by periodic extension
outside it then, for any $\alpha$, the Fourier coefficients $a_n$, $b_n$ of f(x) are given by

$$a_n = \frac{1}{\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\cos nx\,dx \quad \text{for } n = 0, 1, \ldots,$$

$$b_n = \frac{1}{\pi}\int_{\alpha-\pi}^{\alpha+\pi} f(x)\sin nx\,dx \quad \text{for } n = 1, 2, \ldots.$$

The Fourier series of f(x) then has the form

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n\cos nx + b_n\sin nx) \quad \text{for } \alpha - \pi \le x \le \alpha + \pi.$$


It often happens that the fundamental interval to be used is [-L, L]
instead of $[-\pi, \pi]$. When this occurs, the simple variable change $t = \pi x/L$
maps the interval $-L \le x \le L$ onto the interval $-\pi \le t \le \pi$ for which we
already have a Fourier expansion theorem. So, if f(x) is defined on the
fundamental interval $-L \le x \le L$, we see that the function
$f(x) = f(Lt/\pi) = g(t)$, say, is defined on the interval $-\pi \le t \le \pi$. Hence, using our ordinary
Fourier expansion on $-\pi \le t \le \pi$ we can write

$$g(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n\cos nt + b_n\sin nt)$$

with

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\cos nt\,dt \quad \text{for } n = 0, 1, \ldots,$$

and

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\sin nt\,dt \quad \text{for } n = 1, 2, \ldots.$$

To complete the argument, transforming back to the variable x, we arrive
at the following theorem.


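THEOREM 16-3 (Change of interval length) The Fourier series of an integrable
function f(x) defined on the interval [-L, L] is the series

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi x}{L} + b_n\sin\frac{n\pi x}{L}\right)$$

with

$$a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\frac{n\pi x}{L}\,dx \quad \text{for } n = 0, 1, \ldots,$$

and

$$b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\frac{n\pi x}{L}\,dx \quad \text{for } n = 1, 2, \ldots.$$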


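Example 16-5 (a) Deduce the Fourier series of the function

$$f(x) = \begin{cases} -x & \text{for } -\tfrac{1}{2}\pi \le x < 0, \\ x & \text{for } 0 \le x \le \pi, \\ 2\pi - x & \text{for } \pi < x \le \tfrac{3}{2}\pi. \end{cases}$$

(b) Deduce the Fourier series of the function

$$f(x) = x^3 \quad \text{for } -1 < x < 1.$$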


Solution (a) Defining f(x) by periodic extension outside the fundamental
interval $-\tfrac{1}{2}\pi \le x \le \tfrac{3}{2}\pi$, it is easily seen that the graph of f(x) is part of a
sawtooth function (Fig. 16-4). Comparison of Fig. 16-4 and Fig. 16-2 (a)
then shows that the function f(x) is in fact just that part of the function
f(x) = |x| and its periodic extension outside $[-\pi, \pi]$ that lies in the interval
$[-\tfrac{1}{2}\pi, \tfrac{3}{2}\pi]$. Hence, from Theorem 16-2, the Fourier series deduced in Example
16-2 and the one required here must be the same. Consequently, without
further work, we may write

$$f(x) = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos x + \frac{1}{3^2}\cos 3x + \frac{1}{5^2}\cos 5x + \cdots\right) \quad \text{for } -\tfrac{1}{2}\pi \le x \le \tfrac{3}{2}\pi.$$

Fig. 16-4 Fourier series with fundamental interval $[-\pi/2, 3\pi/2]$.

Notice that had the result of Example 16-2 not been available, rather
than integrating the three part functional form given in the present example
over the interval $[-\tfrac{1}{2}\pi, \tfrac{3}{2}\pi]$, it would have been easier to use Theorem 16-2
to justify integrating the simpler function f(x) = |x| over the more convenient
interval $[-\pi, \pi]$.
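
The content of Theorem 16-2 can be checked numerically for this sawtooth. The sketch below is a hedged illustration: `simpson`, `sawtooth` and `a_n` are assumed names, the integrals are approximated by composite Simpson's rule, and the shift $\alpha = \tfrac{1}{2}\pi$ corresponds to the fundamental interval used in part (a).

```python
import math

def simpson(g, lo, hi, panels=2000):
    """Composite Simpson's rule for the integral of g over [lo, hi] (panels must be even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def sawtooth(x):
    """Periodic extension of |x| from [-pi, pi] to all real x."""
    x = (x + math.pi) % (2 * math.pi) - math.pi
    return abs(x)

def a_n(n, alpha):
    """a_n computed over [alpha - pi, alpha + pi], as in Eqn (16-32)."""
    integrand = lambda x: sawtooth(x) * math.cos(n * x)
    return simpson(integrand, alpha - math.pi, alpha + math.pi) / math.pi

for n in range(4):
    print(n, a_n(n, 0.0), a_n(n, math.pi / 2))
```

The two columns agree to within the accuracy of the quadrature, and the values for n = 0 and n = 1 match $a_0 = \pi$ and $a_1 = -4/\pi$ from Example 16-2.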

(b) In this case the fundamental interval is [-1, 1] so that setting L = 1
in Theorem 16-3 gives for the Fourier coefficients of $f(x) = x^3$ the values

$$a_n = \int_{-1}^{1} x^3\cos n\pi x\,dx \quad \text{and} \quad b_n = \int_{-1}^{1} x^3\sin n\pi x\,dx.$$

Routine integration then shows $a_n = 0$ for n = 0, 1, ..., and

$$b_n = 2(-1)^n\frac{(6 - n^2\pi^2)}{n^3\pi^3} \quad \text{for } n = 1, 2, \ldots.$$

Hence the Fourier series of $f(x) = x^3$ in the fundamental interval $-1 < x < 1$
is

$$f(x) = 2\sum_{n=1}^{\infty}(-1)^n\frac{(6 - n^2\pi^2)}{n^3\pi^3}\sin n\pi x.$$

Since the periodic extension of f(x) is discontinuous at $x = \pm 1$, the Fourier
series at these points will converge to the value $\tfrac{1}{2}[f(1-) + f(1+)]$. In this
case this value is zero, since f(1-) = 1 and f(1+) = -1.
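
The closed form for $b_n$ can be compared with a direct numerical evaluation of the integral in Theorem 16-3. This is a sketch under stated assumptions: the helper names are illustrative, and Simpson's rule stands in for the "routine integration" of the text.

```python
import math

def simpson(g, lo, hi, panels=2000):
    """Composite Simpson's rule for the integral of g over [lo, hi] (panels must be even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def b_n_numeric(f, n, L):
    """b_n on [-L, L] from Theorem 16-3, evaluated by quadrature."""
    return simpson(lambda x: f(x) * math.sin(n * math.pi * x / L), -L, L) / L

def b_n_formula(n):
    """Closed form quoted in Example 16-5 (b) for f(x) = x^3 with L = 1."""
    return 2 * (-1) ** n * (6 - n ** 2 * math.pi ** 2) / (n ** 3 * math.pi ** 3)

for n in range(1, 5):
    print(n, b_n_numeric(lambda x: x ** 3, n, 1.0), b_n_formula(n))
```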


(b) Fourier sine and cosine series 


When f(x) is an even function defined over the interval $[-\pi, \pi]$, then
f(-x) = f(x). It thus follows directly that f(x) cos nx is an even function, because
cos nx is even, and f(x) sin nx is an odd function, because sin nx is odd.
Consider the Fourier coefficient $a_n$ of an even function f(x) which we choose
to write in the form

$$a_n = \frac{1}{\pi}\int_{-\pi}^{0} f(x)\cos nx\,dx + \frac{1}{\pi}\int_{0}^{\pi} f(x)\cos nx\,dx.$$

Then, changing the variable in the first integrand by writing u = -x,
employing the even nature of the integrand to replace f(-u) cos n(-u) by
f(u) cos nu and changing the sign of the integral by reversing the limits we
find that

$$a_n = \frac{1}{\pi}\int_{0}^{\pi} f(u)\cos nu\,du + \frac{1}{\pi}\int_{0}^{\pi} f(x)\cos nx\,dx.$$

The variable u is only a dummy variable so it may be replaced by x to
give the result

$$a_n = \frac{2}{\pi}\int_{0}^{\pi} f(x)\cos nx\,dx \quad \text{for } n = 0, 1, \ldots. \tag{16-33}$$

This same argument applied to the coefficient $b_n$ shows that

$$b_n = 0 \quad \text{for } n = 1, 2, \ldots. \tag{16-34}$$

Consequently, when f(x) is an even function in $[-\pi, \pi]$, its Fourier series
contains only cosine functions and is of the form

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos nx \quad \text{for } -\pi \le x \le \pi. \tag{16-35}$$

This is called the Fourier cosine expansion of the even function f(x) in $[-\pi, \pi]$.
When f(x) is an odd function defined over the interval $[-\pi, \pi]$, then
f(-x) = -f(x). The form of argument just used then establishes that

$$a_n = 0 \quad \text{for } n = 0, 1, \ldots, \tag{16-36}$$

and

$$b_n = \frac{2}{\pi}\int_{0}^{\pi} f(x)\sin nx\,dx \quad \text{for } n = 1, 2, \ldots, \tag{16-37}$$

from which it follows that the Fourier series of an odd function defined in
$[-\pi, \pi]$ contains only sine functions and is of the form

$$f(x) = \sum_{n=1}^{\infty} b_n\sin nx \quad \text{for } -\pi \le x \le \pi. \tag{16-38}$$

This is called the Fourier sine expansion of the odd function f(x) in $[-\pi, \pi]$.
These results can be usefully interpreted in terms of any arbitrary function 
f(x) which is to be expanded in the half interval [0,7]. Defining a new 
function g(x) by the rule 


, _ [f(—x) for —-r= x <0 
gx) = ἕν ἴογ 0 -- χ Ξξ π, (16°39) 
we see that g(x) is an even function which is equal to f(x) in the required 
interval [0,7]. Thus, as a Fourier cosine expansion of g(x) only requires 
knowledge of g(x) in the half interval [0, 7] in which g(x) = f(x), it follows 
that Eqn (16-35) provides the desired expansion of f(x) when x is restricted 
to the interval O< x <7. 

Alternatively, we may expand the same function f(x) in the half interval 
[0.. +] in a Fourier sine expansion as follows. Define a new function A(x) by 
the rule 


—f(—x) for —-7r=<x<0 


ONES f(x) forO<x< 7. 


(16:40) 


Then A(x) is an odd function which is equal to f(x) in the required interval 
[0, 7]. The Fourier sine expansion of A(x) only requires knowledge of A(x) 
in the half interval [0, 7] where A(x) = f(x), so that Eqn (16-38) provides the 
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desired expansion if x is restricted to the interval 0 < x <7. For obvious 
reasons, these expansions are often called the half-range expansions of f(x). 
We have now proved the next theorem. 
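The even and odd extensions defined by Eqns (16.39) and (16.40) can be sketched in a few lines of Python. This is only an illustration: the helper names `even_extension` and `odd_extension`, the sample function `f`, and the choice h(0) = 0 (which Eqn (16.40) leaves open) are assumptions made here and are not part of the text.

```python
import math

def even_extension(f):
    """Return g with g(x) = f(-x) for x < 0 and g(x) = f(x) for x >= 0 (Eqn 16.39)."""
    return lambda x: f(-x) if x < 0 else f(x)

def odd_extension(f):
    """Return h with h(x) = -f(-x) for x < 0 and h(x) = f(x) for x > 0 (Eqn 16.40)."""
    def h(x):
        if x < 0:
            return -f(-x)
        if x > 0:
            return f(x)
        return 0.0  # assumption: h(0) = 0, the only value compatible with oddness
    return h

# Hypothetical sample function given only on the half interval [0, pi].
f = lambda x: x * (math.pi - x) + 1.0
g, h = even_extension(f), odd_extension(f)

for x in (0.3, 1.0, 2.5):
    # g(-x) - g(x) and h(-x) + h(x) both vanish, while g = h = f for x > 0.
    print(x, g(-x) - g(x), h(-x) + h(x), g(x) - f(x), h(x) - f(x))
```

Both extensions agree with f(x) on 0 < x ≤ π, which is why either half-range expansion represents f(x) there.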


THEOREM 16.4 (Fourier sine and cosine series) If f(x) is an arbitrary function
defined and integrable on [0, π], then it may either be expanded as a Fourier
cosine series

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{for } 0 \le x \le \pi, \]

in which

\[ a_n = \frac{2}{\pi} \int_0^{\pi} f(x) \cos nx \, dx \quad \text{for } n = 0, 1, \ldots, \]

or as a Fourier sine series

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad \text{for } 0 < x < \pi, \]

in which

\[ b_n = \frac{2}{\pi} \int_0^{\pi} f(x) \sin nx \, dx \quad \text{for } n = 1, 2, \ldots. \]
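The coefficient formulae of Theorem 16.4 can be checked numerically. The sketch below approximates the two integrals by a composite midpoint rule; the function names `integrate`, `cosine_coeff` and `sine_coeff` and the number of panels are assumptions chosen for illustration, not anything prescribed by the text.

```python
import math

def integrate(func, a, b, panels=2000):
    """Composite midpoint rule approximation to the integral of func over [a, b]."""
    h = (b - a) / panels
    return h * sum(func(a + (k + 0.5) * h) for k in range(panels))

def cosine_coeff(f, n):
    """a_n = (2/pi) * integral over [0, pi] of f(x) cos(nx)."""
    return 2.0 / math.pi * integrate(lambda x: f(x) * math.cos(n * x), 0.0, math.pi)

def sine_coeff(f, n):
    """b_n = (2/pi) * integral over [0, pi] of f(x) sin(nx)."""
    return 2.0 / math.pi * integrate(lambda x: f(x) * math.sin(n * x), 0.0, math.pi)

# Half-range coefficients of f(x) = x on [0, pi].
f = lambda x: x
for n in range(4):
    print(n, cosine_coeff(f, n), sine_coeff(f, n) if n > 0 else None)
```

For f(x) = x the computed values can be compared with the closed forms a_0 = π, a_{2n−1} = −4/(π(2n − 1)²), a_{2n} = 0 and b_n = 2(−1)^{n+1}/n.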


Example 16.6 Deduce the Fourier cosine and sine series of the function
f(x) = x in [0, π].

Solution From the first part of Theorem 16.4 we have

\[ a_n = \frac{2}{\pi} \int_0^{\pi} x \cos nx \, dx \quad \text{for } n = 0, 1, \ldots. \]

A simple integration then shows

\[ a_0 = \pi, \qquad a_{2n-1} = \frac{-4}{\pi(2n-1)^2}, \qquad a_{2n} = 0 \quad \text{for } n = 1, 2, \ldots. \]

The Fourier cosine series of f(x) = x thus has the form

\[ f(x) = \frac{\pi}{2} - \frac{4}{\pi} \sum_{n=1}^{\infty} \frac{\cos (2n-1)x}{(2n-1)^2} \quad \text{for } 0 \le x \le \pi. \]

From the second part of Theorem 16.4 we find

\[ b_n = \frac{2}{\pi} \int_0^{\pi} x \sin nx \, dx \quad \text{for } n = 1, 2, \ldots. \]

Evaluating this integral gives

\[ b_n = (-1)^{n+1} \frac{2}{n} \quad \text{for } n = 1, 2, \ldots, \]

so that the Fourier sine series of f(x) = x is seen to have the form

\[ f(x) = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin nx}{n} \quad \text{for } 0 \le x < \pi. \]


From the practical point of view the cosine series is preferable to the sine 
series in this instance because it converges more rapidly. 
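The difference in the rate of convergence can be seen numerically. The following sketch compares the largest error of the two partial sums of Example 16.6 on a grid of points in [0.5, 2.5]; the grid, the number of terms and the function names are assumptions made purely for illustration.

```python
import math

def cosine_partial_sum(x, terms):
    """pi/2 - (4/pi) * sum_{n=1}^{terms} cos((2n-1)x)/(2n-1)^2."""
    s = sum(math.cos((2 * n - 1) * x) / (2 * n - 1) ** 2 for n in range(1, terms + 1))
    return math.pi / 2 - 4.0 / math.pi * s

def sine_partial_sum(x, terms):
    """2 * sum_{n=1}^{terms} (-1)^(n+1) sin(nx)/n."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))

def max_error(partial_sum, terms, points=201):
    """Largest deviation from f(x) = x over an evenly spaced grid in [0.5, 2.5]."""
    xs = [0.5 + 2.0 * k / (points - 1) for k in range(points)]
    return max(abs(partial_sum(x, terms) - x) for x in xs)

for terms in (10, 50, 250):
    print(terms, max_error(cosine_partial_sum, terms), max_error(sine_partial_sum, terms))
```

The coefficients of the cosine series decay like 1/n² whereas those of the sine series decay only like 1/n, which is the reason the cosine partial sums settle down so much sooner.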


(c) Complex form of Fourier series 

Suppose the arbitrary function f(x) is defined on the interval [−L, L]. Then,
as in Theorem 16.3, its Fourier series will contain sine and cosine functions
with argument nπx/L. If these are expressed in terms of the complex
exponential function exp(inπx/L) it transpires that the series may be written
in a convenient complex form. To achieve this, observe first that a simple
integration gives

\[ \int_{-L}^{L} \exp\left(\frac{im\pi x}{L}\right) \exp\left(\frac{in\pi x}{L}\right) dx = \begin{cases} 0 & \text{for } m \ne -n \\ 2L & \text{for } m = -n. \end{cases} \tag{16.41} \]

First we assume that f(x) is continuous and satisfies the conditions of
Theorem 16.1, so that its Fourier series always converges to f(x). Then, in
place of Definition 16.1, it is natural to write

\[ f(x) = \sum_{n=-\infty}^{\infty} A_n \exp\left(\frac{in\pi x}{L}\right), \tag{16.42} \]

where the complex coefficients A_n are to be determined, and the summation
indicates that n = 0, ±1, ±2, .... Multiplication of Eqn (16.42) by
exp(−imπx/L), integration over [−L, L] and use of Eqn (16.41) then lead,
after m has been relabelled n, to the result

\[ A_n = \frac{1}{2L} \int_{-L}^{L} f(x) \exp\left(-\frac{in\pi x}{L}\right) dx \quad \text{for } n = 0, \pm 1, \pm 2, \ldots. \tag{16.43} \]

Using these coefficients in Eqn (16.42) gives the complex Fourier series
of f(x) which, when simplified, naturally reduces to the same series that
would have been obtained had Definition 16.1 been employed. If the symbol
∼ is used in place of the equality in (16.42) with the sense that was explained
after Theorem 16.1, we immediately arrive at the following alternative
definition of a Fourier series.
DEFINITION 16.2 (Complex Fourier series) The complex Fourier series of an
integrable function f(x) defined on the interval [−L, L] is the series

\[ f(x) \sim \sum_{n=-\infty}^{\infty} A_n \exp\left(\frac{in\pi x}{L}\right), \]

in which the complex Fourier coefficients A_n are given by

\[ A_n = \frac{1}{2L} \int_{-L}^{L} f(x) \exp\left(-\frac{in\pi x}{L}\right) dx \quad \text{for } n = 0, \pm 1, \pm 2, \ldots. \]


Example 16.7 Find and simplify the complex Fourier series of the function

\[ f(x) = \begin{cases} a & \text{for } -\pi < x < 0 \\ b & \text{for } 0 < x < \pi. \end{cases} \]

Solution Using Definition 16.2 with L = π we can at once write

\[ A_0 = \frac{1}{2\pi} \int_{-\pi}^{0} a \, dx + \frac{1}{2\pi} \int_{0}^{\pi} b \, dx = \frac{a+b}{2}, \]

\[ A_n = \frac{a}{2\pi} \int_{-\pi}^{0} \exp(-inx) \, dx + \frac{b}{2\pi} \int_{0}^{\pi} \exp(-inx) \, dx = \frac{i(a-b)}{2\pi n} [1 - (-1)^n] \]

and

\[ A_{-n} = \frac{a}{2\pi} \int_{-\pi}^{0} \exp(inx) \, dx + \frac{b}{2\pi} \int_{0}^{\pi} \exp(inx) \, dx = \frac{-i(a-b)}{2\pi n} [1 - (-1)^n]. \]

Hence the complex Fourier series is

\[ f(x) = \frac{a+b}{2} + \sum_{\substack{n=-\infty \\ n \ne 0}}^{\infty} A_n \exp(inx), \]

in which A_n, A_{−n} have the values just computed.

Combining the general terms for ±n in the complex Fourier series we
obtain

\[ A_n \exp(inx) + A_{-n} \exp(-inx) = \begin{cases} 0 & \text{for even } n \\ \dfrac{b-a}{n\pi} [1 - (-1)^n] \sin nx & \text{for odd } n. \end{cases} \]

Hence the complex Fourier series of f(x) for −π < x < π reduces to the
real series

\[ f(x) = \frac{a+b}{2} + \frac{2(b-a)}{\pi} \sum_{m=1}^{\infty} \frac{\sin (2m-1)x}{2m-1}, \]

which, as would be expected, is exactly the result obtained in Example 16.3.
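The coefficients found in Example 16.7 can be confirmed by evaluating the integral of Definition 16.2 numerically. The sketch below uses a midpoint rule with an even number of panels so that the jump at x = 0 falls on a panel boundary; the sample values a = 2, b = 5 and the names `complex_coeff` and `closed_form` are assumptions made here for illustration only.

```python
import cmath
import math

def complex_coeff(f, L, n, panels=4000):
    """Midpoint-rule estimate of A_n = (1/2L) * integral_{-L}^{L} f(x) exp(-i n pi x / L) dx."""
    h = 2.0 * L / panels
    total = 0j
    for k in range(panels):
        x = -L + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * math.pi * x / L)
    return total * h / (2.0 * L)

a, b = 2.0, 5.0                      # hypothetical sample values for the step function
step = lambda x: a if x < 0 else b

def closed_form(n):
    """A_n = i(a - b)[1 - (-1)^n] / (2 pi n) for n != 0, as derived in Example 16.7."""
    return 1j * (a - b) * (1 - (-1) ** n) / (2 * math.pi * n)

print(0, complex_coeff(step, math.pi, 0), (a + b) / 2)
for n in (1, 2, 3, -1, -3):
    print(n, complex_coeff(step, math.pi, n), closed_form(n))
```

Because f(x) is real, the computed A_{−n} is the complex conjugate of A_n, which is what allows the terms for ±n to be combined into a real sine term.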


16.4 Differentiation and integration

In this section we complete our discussion of Fourier series by examining 
the important questions of when the derivative and integral of the Fourier 
series of a function f(x) are equal, respectively, to the derivative and integral 
of the function f(x) itself. 


(a) Integration 


Suppose that the function f(x) is defined in the interval [−π, π], and let g(x)
be defined by the requirement

\[ g(x) = \begin{cases} 0 & \text{for } -\pi \le x < x_1, \\ 1 & \text{for } x_1 \le x \le x_2, \\ 0 & \text{for } x_2 < x \le \pi. \end{cases} \]

Using the obvious algebraic identity

\[ f(x)g(x) = \tfrac{1}{4} \{ [f(x) + g(x)]^2 - [f(x) - g(x)]^2 \}, \]

the properties of g(x) then allow us to write

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)g(x) \, dx = \frac{1}{\pi} \int_{x_1}^{x_2} f(x) \, dx = \frac{1}{4\pi} \int_{x_1}^{x_2} [f(x) + 1]^2 \, dx - \frac{1}{4\pi} \int_{x_1}^{x_2} [f(x) - 1]^2 \, dx. \tag{16.44} \]

However, if the Fourier coefficients of f(x), g(x) are a_n, b_n and a_n′, b_n′,
respectively, and f(x) is continuous over [x_1, x_2] so that its Fourier series
converges to the function f(x) in this interval, we know that Parseval's
relation must hold for the functions f(x) + 1 and f(x) − 1 since, being the
sum of continuous functions, they are themselves continuous. Consequently,
applying the result of Eqn (16.20) to the right-hand side of Eqn (16.44) we
obtain

\[ \frac{1}{\pi} \int_{x_1}^{x_2} f(x) \, dx = \frac{a_0 a_0'}{2} + \frac{1}{4} \sum_{n=1}^{\infty} [(a_n + a_n')^2 + (b_n + b_n')^2] - \frac{1}{4} \sum_{n=1}^{\infty} [(a_n - a_n')^2 + (b_n - b_n')^2] \]

or, after simplification,

\[ \frac{1}{\pi} \int_{x_1}^{x_2} f(x) \, dx = \frac{a_0 a_0'}{2} + \sum_{n=1}^{\infty} (a_n a_n' + b_n b_n'). \tag{16.45} \]

Using the form of g(x) to find the Fourier coefficients a_n′, b_n′ then gives:

\[ a_0' = \frac{1}{\pi} \int_{x_1}^{x_2} dx = \frac{x_2 - x_1}{\pi}, \]

\[ a_n' = \frac{1}{\pi} \int_{x_1}^{x_2} \cos nx \, dx \quad \text{and} \quad b_n' = \frac{1}{\pi} \int_{x_1}^{x_2} \sin nx \, dx. \]

Inserting these results into Eqn (16.45) we finally conclude that

\[ \int_{x_1}^{x_2} f(x) \, dx = \frac{a_0}{2} (x_2 - x_1) + \sum_{n=1}^{\infty} \int_{x_1}^{x_2} (a_n \cos nx + b_n \sin nx) \, dx. \tag{16.46} \]

In words, this result asserts that when f(x) is a continuous function, the
integral of f(x) over the interval [x_1, x_2] is equal to the result obtained by
termwise integration of the Fourier series of f(x) over that same interval.
In general, although convergent, this new series will not be a Fourier series
because of the presence of the term a_0(x_2 − x_1)/2. The result we have
established in (16.46) would still be true even if the convergence of the
Fourier series to f(x) was only convergence in the mean, for Parseval's relation
would still be valid. Using this form of convergence, a more subtle argument
establishes that termwise integration of the Fourier series of a piecewise
continuous function is also always permissible. That is, the integrated
Fourier series of a piecewise continuous function f(x) will always converge
to the integral of the function f(x) itself. We have established our next
result which we choose to state in a slightly more general form than we have
actually proved.


THEOREM 16.5 (Integration of Fourier series) If f(x) is an integrable piecewise
continuous function in the interval [−π, π] with the Fourier series

\[ f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \]

then

\[ \int_{x_1}^{x_2} f(t) \, dt = \frac{a_0 (x_2 - x_1)}{2} + \sum_{n=1}^{\infty} \int_{x_1}^{x_2} (a_n \cos nt + b_n \sin nt) \, dt. \]
Example 16.8 Discuss the result of termwise integration of the Fourier
series of the function

\[ f(x) = \begin{cases} -1 & \text{for } -\pi < x < 0 \\ 1 & \text{for } 0 < x < \pi \end{cases} \]

over the interval [−π, x] with −π < x < π.

Solution Setting a = −1, b = 1 in the result of Example 16.3 shows that
the Fourier series of f(x) is

\[ f(x) = \frac{4}{\pi} \left( \sin x + \frac{1}{3} \sin 3x + \frac{1}{5} \sin 5x + \cdots \right). \tag{A} \]

We now need to apply Theorem 16.5 with x_1 = −π and x_2 = x and, in
doing so, because of the discontinuity in f(x), to consider the two cases
−π < x < 0 and 0 < x < π.

Case (i) −π < x < 0:
From Theorem 16.5 we have

\[ \int_{-\pi}^{x} (-1) \, dt = \sum_{n=1}^{\infty} \int_{-\pi}^{x} \frac{4 \sin (2n-1)t}{\pi (2n-1)} \, dt \]

and so

\[ -(x + \pi) = -\frac{4}{\pi} \left[ \sum_{n=1}^{\infty} \frac{\cos (2n-1)t}{(2n-1)^2} \right]_{-\pi}^{x} = -\frac{4}{\pi} \left( \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \right) - \frac{4}{\pi} \left( 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots \right). \]

The numerical series has the sum π²/8 so that this result simplifies to

\[ -x = \frac{\pi}{2} - \frac{4}{\pi} \left( \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \right) \quad \text{for } -\pi < x < 0. \tag{B} \]

Case (ii) 0 < x < π:
Here, as a result of an application of Theorem 16.5, we have

\[ \int_{-\pi}^{0} (-1) \, dt + \int_{0}^{x} (1) \, dt = \sum_{n=1}^{\infty} \int_{-\pi}^{x} \frac{4 \sin (2n-1)t}{\pi (2n-1)} \, dt, \]

which after simplification as before gives

\[ x = \frac{\pi}{2} - \frac{4}{\pi} \left( \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \right) \quad \text{for } 0 < x < \pi. \tag{C} \]

Results (B), (C) taken together comprise the series representing the
function

\[ f(x) = \begin{cases} -x & \text{for } -\pi < x < 0 \\ x & \text{for } 0 < x < \pi. \end{cases} \]

This is simply the function f(x) = |x| examined in Example 16.2. In this
case the result is a Fourier series because there was no constant term present
in (A).
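The termwise integration carried out in Example 16.8 can be checked numerically. The sketch below integrates each term of series (A) from −π to x in closed form, sums a finite number of them, and compares the total with the exact integral of the step function, which is |x| − π. The function names and the number of terms retained are assumptions made for illustration.

```python
import math

def integrated_series(x, terms=2000):
    """Sum of the termwise integrals from -pi to x of (4/pi) sin((2n-1)t)/(2n-1)."""
    total = 0.0
    for n in range(1, terms + 1):
        k = 2 * n - 1
        # Integral of sin(kt) from -pi to x is (cos(k*pi) - cos(kx))/k = -(1 + cos(kx))/k
        # for odd k; the coefficient of series (A) supplies the second factor 1/k.
        total += -(1.0 + math.cos(k * x)) / k ** 2
    return 4.0 / math.pi * total

def exact_integral(x):
    """Integral from -pi to x of f(t) = -1 (t < 0), +1 (t > 0): equals |x| - pi."""
    return abs(x) - math.pi

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(x, integrated_series(x), exact_integral(x))
```

The agreement holds on both sides of the discontinuity at x = 0, in keeping with the statement of Theorem 16.5 for piecewise continuous functions.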

(b) Differentiation 


Let us suppose initially that f(x) is continuous in the interval [−π, π], that
its derivative is also continuous and that its Fourier coefficients are a_n, b_n.
Then the Fourier coefficients a_n′, b_n′ of the derivative f′(x) of f(x) are, by
definition

\[ a_n' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \cos nx \, dx \quad \text{for } n = 0, 1, \ldots, \]

and

\[ b_n' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \sin nx \, dx \quad \text{for } n = 1, 2, \ldots. \]

Integrating by parts gives for a_n′ the results

\[ a_0' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \, dx = \frac{1}{\pi} [f(\pi) - f(-\pi)], \]

\[ a_n' = \frac{1}{\pi} \left( [f(x) \cos nx]_{-\pi}^{\pi} + n \int_{-\pi}^{\pi} f(x) \sin nx \, dx \right) = \frac{1}{\pi} \cos n\pi \, [f(\pi) - f(-\pi)] + n b_n \quad \text{for } n = 1, 2, \ldots, \]

and for b_n′ the results

\[ b_n' = \frac{1}{\pi} \left( [f(x) \sin nx]_{-\pi}^{\pi} - n \int_{-\pi}^{\pi} f(x) \cos nx \, dx \right) = -n a_n \quad \text{for } n = 1, 2, \ldots. \]

If we also require f(π) = f(−π), so that when f(x) is continued by periodic
extension outside the fundamental interval [−π, π] it is also continuous at
x = ±π, then at once it follows that

\[ a_0' = 0, \quad a_n' = n b_n \quad \text{and} \quad b_n' = -n a_n. \]

Thus, writing down the Fourier series of f′(x) in −π < x < π we find

\[ f'(x) = \sum_{n=1}^{\infty} (a_n' \cos nx + b_n' \sin nx) = \sum_{n=1}^{\infty} n (b_n \cos nx - a_n \sin nx) = \sum_{n=1}^{\infty} \frac{d}{dx} (a_n \cos nx + b_n \sin nx). \]

Hence we have shown that termwise differentiation of the Fourier series of
a continuous function f(x), which has a continuous derivative together with
the property that f(π) = f(−π), yields a Fourier series which is convergent
to the derivative f′(x) in [−π, π]. If the derivative f′(x) is only piecewise
continuous, then a slight modification of the argument yields the same result,
apart from the fact that the differentiated Fourier series of f(x) does not then
converge to f′(x) at points where the derivative f′(x) is discontinuous. Our
final theorem may now be stated.


THEOREM 16.6 (Differentiation of Fourier series) Let f(x) be a continuous
function with a piecewise continuous derivative in the interval [−π, π], and
let f(π) = f(−π). Then, if the Fourier series of f(x) is

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \]

termwise differentiation of this series is permissible and

\[ f'(x) = \sum_{n=1}^{\infty} \frac{d}{dx} (a_n \cos nx + b_n \sin nx) \]

for −π < x < π, except at points where f′(x) is discontinuous.
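The coefficient relations a_n′ = n b_n and b_n′ = −n a_n that underlie Theorem 16.6 can be checked numerically for a function satisfying f(π) = f(−π). The sketch below uses f(x) = π² − x² with derivative −2x; the midpoint-rule helper, the function names and the panel count are assumptions chosen for illustration.

```python
import math

def integrate(func, a, b, panels=4000):
    """Composite midpoint rule approximation to the integral of func over [a, b]."""
    h = (b - a) / panels
    return h * sum(func(a + (k + 0.5) * h) for k in range(panels))

def fourier_coeffs(f, n):
    """Return (a_n, b_n) of f on [-pi, pi] using the usual coefficient integrals."""
    a_n = integrate(lambda x: f(x) * math.cos(n * x), -math.pi, math.pi) / math.pi
    b_n = integrate(lambda x: f(x) * math.sin(n * x), -math.pi, math.pi) / math.pi
    return a_n, b_n

f = lambda x: math.pi ** 2 - x ** 2   # continuous, with f(pi) = f(-pi) = 0
df = lambda x: -2.0 * x               # its derivative

for n in range(1, 5):
    a_n, b_n = fourier_coeffs(f, n)
    da_n, db_n = fourier_coeffs(df, n)
    # Expect da_n close to n*b_n (both zero here) and db_n close to -n*a_n.
    print(n, da_n, n * b_n, db_n, -n * a_n)
```

If the condition f(π) = f(−π) is dropped, the boundary term (1/π) cos nπ [f(π) − f(−π)] no longer vanishes and the relation a_n′ = n b_n fails.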


Example 16.9 Deduce the Fourier series of f(x) = x in [−π, π] from the
Fourier series of f(x) = π² − x². Give an example where termwise
differentiation of a Fourier series is not permissible.

Solution From Example 16.1 we see that the Fourier series of f(x) = π² − x²
in [−π, π] is

\[ \pi^2 - x^2 = \frac{2\pi^2}{3} + 4 \left( \cos x - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - \cdots \right). \]

As f(π) = 0 = f(−π) and f(x) and f′(x) are continuous, Theorem 16.6 is
applicable so that differentiation of the above result yields

\[ -2x = -4 \left( \sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \cdots \right) \]

whence,

\[ x = 2 \left( \sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \cdots \right) \]

for −π < x < π. A direct calculation easily verifies that this is the correct
Fourier expansion of the function f(x) = x for −π < x < π.
For our example of a Fourier series where termwise differentiation is not
permissible we need look no further than Example 16.3, for there f(x) is not
continuous in [−π, π] and also f(π) ≠ f(−π). If the resulting Fourier series
in that example is differentiated termwise it yields

\[ \frac{2}{\pi} (b - a)(\cos x + \cos 3x + \cos 5x + \cdots) \]

which, by virtue of Theorem 12.2, is divergent for all x in −π < x < π other
than x = ±π/2, where every term vanishes, because elsewhere the nth term
cos (2n − 1)x does not tend to zero.
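The failure of the differentiated series to converge can be illustrated by looking at its partial sums. The sketch below compares them with the standard closed form sin(2Nx)/(2 sin x) for the sum of the first N odd-harmonic cosines (valid when sin x ≠ 0); the function names and sample points are assumptions made for illustration.

```python
import math

def partial_sum(x, terms):
    """cos x + cos 3x + ... + cos((2N-1)x) with N = terms."""
    return sum(math.cos((2 * n - 1) * x) for n in range(1, terms + 1))

def closed_form(x, terms):
    """sin(2Nx) / (2 sin x), the closed form of the partial sum when sin x != 0."""
    return math.sin(2 * terms * x) / (2.0 * math.sin(x))

for terms in (10, 11, 12, 100, 101, 102):
    # At x = 0 the partial sum equals N and grows without bound; at x = 1 it keeps
    # oscillating with N instead of approaching a limit.
    print(terms, partial_sum(0.0, terms), partial_sum(1.0, terms), closed_form(1.0, terms))
```

Since successive partial sums differ by a cosine term that does not shrink, no limit is approached at these points.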


PROBLEMS 


Section 16:1 
Find the Fourier series of each of the following functions. 


16.1 f(x) = x for −π < x < π.
16.2 f(x) = 2 for —a <x < a, 
0 ἴὉΓ ---, χορ 
163 fx) = {horde 59 
16.4 f(x) = |cos x| for −π < x < π.
16.5 f(x) = |sin x| for −π < x < π.
16.6 f(x) = x² − 2x for −π < x < π.
16.7 f(x) = cos αx for −π < x < π with α ≠ 0, ±1, ±2, ....
16.8 f(x) = sin αx for −π < x < π with α ≠ 0, ±1, ±2, ....
16.9 f(x) = sin² x for −π < x < π.
—lfor~7<x< -—la 
1610 f(x) = | | for --ζπ <x <4a 
—lfordir<x< π. 
Ofor-r7<x<0 
16-11 f(x) = f forO <x <4n 
Ὁ ἴογ ἐπ -, χ - π1 


Section 16-2 


16.12 Apply Parseval's relation to the Fourier series of the function f(x) = π² − x²
for −π < x < π to deduce a series for π⁴/90.
16.13 Apply Parseval's relation to the Fourier series of the function f(x) = x² for
−π < x < π to deduce a series for π⁴/90.
16°14 Deduce an expansion for = by using the Fourier series for the function 
fi) = ee a SO 
sinx ἴογ 0 <x < σ. 
16°15 Using the Fourier series expansion of the function 
fe ae for —r<x<0 
4nfor0 <x - π, 
deduce that 


ty 14 1 1 1 4: 1 " ] 
cami: δ΄ τὰ τ 

16.16 Express the function f(x) = e^x as a Fourier series in the interval −π < x < π.
Use the resulting series to find a series representation for e. 
16.17 Use the Fourier series expansion of the function f(x) = x + x² in −π < x < π
to deduce a series for π²/6 by setting x = π.


Section 16:3 


16.18 Find the Fourier series of the function f(x) = x for 0 < x < 2π.
16°19 Find the Fourier series of the function 
_ fx* for --Ἠὠπ χε π 

fd) ἠδ oe -- x)? for7<x< aa 

16°20 Find the Fourier series of the function 
O for —inrn<x<0 
[οὐ Ξ i xforO<x<a 
Ofor7<x< on 

16.21 Find the Fourier series of the function f(x) = x² for π < x < 3π.
16.22 Find the Fourier series of the function f(x) = x for −1 < x < 1.
16.23 Find the Fourier series of the function f(x) = |x| for −3 < x < 3.
16.24 Find the Fourier series of the function f(x) = sin x for 0 < x < π.
16.25 Find the Fourier sine series for the function f(x) = cos x in 0 < x < π.
16:26 Find the Fourier cosine series for the function 

f) = χ' forO0 <x < π. 
16:27 Find the Fourier cosine series for the function 

f(x) =e forO<x< 7. 
16°28 Find the complex Fourier series for the function [ΚΑῚ =e for-l<x<l. 

Use the result to deduce the ordinary Fourier series for this function. 


Section 16:4 
16.29 Given that the Fourier series of the function f(x) = 2x + 1 for −π < x < π
is

\[ f(x) = 1 - 4 \sum_{n=1}^{\infty} (-1)^n \frac{\sin nx}{n}, \]

deduce a series expansion for the function g(x) = x² + x. Is this new series
a Fourier series?
16°30 Apply termwise differentiation to the Fourier series of the function 

f(x) = εἰς “΄Μὅπξχεῦρ 

51ηὴ χ ἴογ 0 <x -Ξ π, 

to obtain the Fourier series of the function 

ee cee aih asic 

2 cosxforO0 <x < 7. 

Is the Fourier series of g(x) termwise differentiable for any xin —7 <x < 7? 

Examine the convergence properties of the differentiated series for 2(x). 
16.31 Theorem 16.6 is stated for a continuous function f(x) which has a derivative
which is only piecewise continuous, though the proof was only given on the
assumption that the derivative was continuous. Modify the process of
integration by parts to allow for functions f(x) which have derivatives
which are only piecewise continuous and so complete the proof of the
theorem as stated.
General Problems

16.32 Find the Fourier series of the function f(x) = sinh ax for −π < x < π.
What is the value assumed by the Fourier series at x = ±π?
16.33 Find the Fourier series of the function f(x) = cosh ax for −π < x < π.
What is the value assumed by the Fourier series at x = ±π?
16.34 Find the Fourier series of the function f(x) = x sin x for −π < x < π.
16.35 Find the Fourier sine series of the function

xfor0< x <1 


ἐξ sealer ee gene 


16.36 Find the Fourier cosine series of the function
f(x) = x(π − x) for 0 < x < π.
16.37 Using the discussion at the start of Section 16.4 (a) as a basis, prove that if
f(x), g(x) are any two integrable functions defined on the interval [−π, π]
and their Fourier coefficients are a_n, b_n and a_n′, b_n′, respectively, then

\[ \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)g(x) \, dx = \frac{a_0 a_0'}{2} + \sum_{n=1}^{\infty} (a_n a_n' + b_n b_n'). \]

Apply this result to the case f(x) = |x|, g(x) = x² for −π < x < π and
hence deduce a series representation for π⁴/96.
16.38 Let

\[ T_n(x) = \frac{\alpha_0}{2} + \sum_{r=1}^{n} (\alpha_r \cos rx + \beta_r \sin rx) \]

for −π < x < π where α_r, β_r are arbitrary coefficients. Then, if f(x) is an
arbitrary integrable function for −π < x < π, use the orthogonality
property of the sine and cosine functions to prove that the integral of the
square of the error between f(x) and T_n(x)

\[ \Delta_n = \int_{-\pi}^{\pi} [f(x) - T_n(x)]^2 \, dx \]

is a minimum when the coefficients α_r, β_r assume the values of the Fourier
coefficients of f(x). This is called the least squares property of Fourier series.
[Hint: Regard Δ_n as a function of the variables α_r, β_r and choose their values
so that Δ_n is a minimum.]


Answers 


Chapter 1 

1-1 (a) —3, 3, —4, 4, —5, 5, —6, 6; 
(b) 64, 125, 216; 
(c) (0, 4), (0, —4), (4, 0), (—4, 0); 
(d) (0, 7, 7), (7, 0, 7), (0, 8, 8), (8, 0, 8), C1, 7, 8), (7, 1, 8); 
(e) 1. 

1.5 (a) BC A: B is proper subset of 4; (6) AN B=¢: disjoint; (Ὁ AN B= ¢: 
disjoint. | 

1.13 x ∈ A\(A\B) implies x ∈ A, x ∉ A\B which in turn implies x ∈ A ∩ B. Conversely,
x ∈ A ∩ B implies x ∈ A, x ∉ A\B which in turn implies x ∈ A\(A\B).


1.17 Three mutually perpendicular axes each with two distinct points marked,
respectively, H and T and each representing the outcome of tossing one of the three coins.
The result of tossing three coins could then be represented by a point in space.


1.19 It follows from Eqn (1.10) that P(A ∩ B) = P(B)P(A | B). However, from
Definition (1.1) P(B | A) = P(A ∩ B)/P(A), so that it is also true that P(A ∩ B) =
P(A)P(B | A).


1-23 (a) 1/36; (Ὁ) 25/36; (ὦ) 5/18. 


9 
‘ = 126, 
1:25 (ἢ 
rar (2) « (2) 


1-29 16/81. 
1:33 Not unique. For example, two rows could be interchanged. 
1-45 (4) α = 1100, ὃ = 1011, - ὁ ΞΞ 10111, --- ῥὃ Ξξ 1; 
(Ὁ) a = 11-0001, ὁ = 0-01, a + ὃ = 11-0101, a — ὃ = 10-1101; 
(Ὁ) a=0-011, ὃ = 0-:0001,a+ 5 =0-0111, a — ὃ =0-0101. 
1-47 (a) 1000; (Ὁ) 2:2; (c) 2:001; (4) 0-0112, 
1.49 (a) a+56=20-1,a—b =1-2, ab = 1001; 


(b) a+b = 1001, -- ὃ ξξ 222, ab =0-1; 
(Ὁ a+b =0-021, a — δ = 0-012, ab = 0-00002. 
1.53 General solution un, = A(—3)" + B2"; Particular solution un = -ὦ (—3)" 
2 
- 2", 
= 5 


155 General solution un = A + 1B; Particular solution uw, = 1 + 7. 


Chapter 2 


2.3 (a) [1,2]: many-one; (Ὁ) (5, 17): one-one; (c) [L, 17]: many-one; (d) [1, 10}: many- 
one. 


2.5 f(n) = 3 for n = 7, 8, 9, 10, 11, 12, 13. 
2.7 Possible examples are: 


(a) the number of integers between n? and n° as a function of the integer variable n; 


(Ὁ) the number of positive real roots of the equation x” + x7-1 4 xn-2 4... 
+ x + 1 = 0 asa function of the integer z. 


2.9 The domain comprises the names of all driving examiners, and the range comprises 


all qualified drivers: The mapping is one-many because each named examiner 
issues many licences. 


2-15 (a) neither even nor odd; (Ὁ) odd; (c) neither even nor odd; (4) odd; (e) even; 
(ἢ odd; (g) neither even nor odd. In (h) to (j) the first group of terms is even and 
the second is odd: (h) f(x) = (1 + x sin x) + x with the interval [—2z, 27]; 
(i) ΚΑ) = 1 + @ + |x| sin x) with the interval [—37, 37]; () f@&) = ( + 2x?) 
— (x — 4x3) with the interval [—3, 3]. 

2°17 (a) 0 is simply a lower bound, 11 is a strict upper bound; 
(b) 0 is a strict lower bound, 2 is simply an upper bound; 
(c) neither upper bound 2 nor lower bound 1/6 are strict; 
(d) 0 is simply a lower bound, 1 is a strict upper bound. 

2:19 Possible examples are: x3 + x + 1, x4 — x3 + 2x? +x 41, x5 443 4 x2 42x 
+ 2, (x? — 4)/(x5 + 2x4 — x2 + % - 7). 

2:21 x3 maps [1, 3] into [0, 30]; 
x + sin x maps [0, ἐπ] onto [0, ἐ(2 + πὴ}; 
x* maps (1, 4] into [1, 16]; 
x4 maps [—1, 2] onto [0, 16}. 


2:23 Domain of fog is [-6, 3]: Range of fog is [2, 5]. 


2-29 Part of ellipse centred on origin with semi-major axis 2 drawn along x-axis and 
semi-minor axis 1 drawn along y-axis. Curve lies in the region x > 0. 

2°33 Possible parameterizations are: 
(a) x = 2t,y = 407? + 2t+1 ἴοτ  - κ᾿ Ξ 4; 
(b) x =2sint, y= 4sin?¢ + 2sint+1 for0 <¢ < ἐπ. 


2.35 Cross-sections by planes x = constant are straight lines and cross-sections by 
planes y = constant are parabolas. 


2.37 Domain of definition is annulus centred on (1, 2) with inner radius +/2 and outer 
radius 3. Level curves are concentric circles centred on (1, 2). 
Chapter 3 


3:7 There is no unique interval in either case. In (a) the interval is of the form (α, 2 + 8), 
where ¢ > 0 is arbitrary, and 23/11 < α < 21/10. In (Ὁ) the interval is of the form 
(ὁ, 2 + €), where ¢ > 0 is arbitrary, and 2003/1001 < ὁ < 2001/1000. 

3.9 Limit points are: 
0, corresponding to a sub-sequence with πὶ odd; 


1, corresponding to a sub-sequence with z a member of the sequence {4, 8, 12, . . .}; 
—1, corresponding to a sub-sequence with n a member of the sequence {2, 6, 


10,.. .} 
3:11 Limit points are 1, 0 and they are not members of the sequence. 
3-13 nth term μη = 2(2"—-D2") | lim un = 2. 
3-15 (a) O(%”); (δ) OC); (©) Οὐ); (ὁ) OM); (e) OG). 
3°23 (a) μι = 1, μῷ = 3, us = 1-7778, us = 1:6799, us = 1°7748; 
(Ὁ) μι = 2, μα = 1.625, us = 1:7592, us = 1°6874, us = 1:7217; 
34/5 = 1-7100. 


3:25 To prove {vam+1(*) — vem(x)} is a null sequence use the fact that 


xemtl xem x 2m 
| vam+i(x) — Vam(x) | = “5am Z2m-1 = 5 |x -- 2]. 


This tends to zero as m --» 00, provided | x/2 | < 1. To sum the series, first rewrite 
the inequality in the form 


vm) 21-2 (1-342 πα ἡ τὰ =) 


and then sum the bracketed geometric series to obtain 


x 2m—1 
vam(x) Σ ἢ zal (-3) | 


2 ἘΧ 
Hence, if x > 0 and | x/2|] - 1, 


— X 


fi < : 
rig or O<x<2 


lim vem(x) = e-* > 


"3 qr2 
3:27 (a) 1; (Ὁ) 4; © Ἶ + sin re (4) 0; (ὁ) infinite. 


3°29 (a) all x; (Ὁ) (--οὐ, —1), (—1, 1), (1, ©); © all x; (d) (--οὐ, —4), (—4, ὦ); 
(e) everywhere apart from the infinity of points x = ππ|2. 


3-31 The polynomial expression assumes the correct functional values when x is equal 
to xo, x1, Or x2. Hence if x1, x2, and x3 are close together, the graph of this poly- 
nomial must in some sense approximate y = f(x) in the interval xo < x < xe. If 
y = sin x for0 < x < 3a, and Xo, x1, x2 are taken so that xo = 0, x1 = 27, xg = 37, 
then the interpolation formula would suggest that f(x) is identically zero. 


3°33 f(0-9) = 0°7931, sin 0-9 = 0-7833, f(1-5) = 0-9936, sin 1-5 = 0-9975, 
3:35 (a) uniform rod; (b) two thirds of rod is uniform with density ρὲ and one-third 
of rod is uniform with density pe; (c) linear variation of density along rod. 
pi. for x*+y?<R, x <0, 
337 (a) f(x,y) = | ᾿ 
p2 for χϑ τ γέ “ΚΑ, x>0. 


(0) On the assumption that the disc is centred on the origin: 
p for R < (x? + y?)!/2 < 2R, 
(x,y) = 2(χ3 4 yp2/1/2 
[Oy = ; [ - er ἘΞ δὴ for 2Κ < (x? + y®)!/2 < 31. 
pr for y>O,ysx,0<x <L/V/2 


ce) fy) = 
(c) fr, y) . for y<0,y>—x,0<x <L/y2. 


3.39 (a) No neighbourhood exists because of infinities along x = —1 and y = 2; 


(b) Interior of circle x? + y? = 1. Defined at P but undefined everywhere on the 
boundary of this circular neighbourhood; 


(c) The entire (x, y)-plane. Defined everywhere in neighbourhood, including the 
point P. 


2 
3:41 Possible examples are as follows: (a) (x + 1)(4 — y); (b) sin (Ξ - »}: (c) 


sin ΞῸ 
y+3} 


3.45 (a) 1/2; (Ὁ) 0; (c) cos x; (4) 1/32; (e) lim | sin x | = —1; lim Feed = I, 
x—>-0— x x—0+ x 


3°49 (a) e-!; (Ὁ) e-2; (Ὁ) e712; (d) e?. 


Chapter 4 


4-1 (a) 4 ft/s in a south-west direction; (b) 2-5 ft/s in a north-east direction; (c) 5 ft/s 
due east. 


43 (a) £67; (Ὁ) t13V3; © +5; @ tiv2. 


45 (a) 1, -1, iV2, --ἰν2; (0) i772, —ivV2, i¥3, -—iV3; © V2, --ν2, V3, 
—/3. 


4.7 (a) z= —ll +7; (Ὁ) z= —3i; ()z=0; (d) z=44 171. 

49 (4) b= -3; (0) a=1,5=4; ©a=3,b=7; Gda=6,b=5. 
411 (a) Ζι +272 =7+6i; (b) 21 +22=—i; (C) 71 +22 =0; (ὦ) 214+ 22 =2. 
4-13 (a) ΖιΖε = —1 - 51]; (Ὁ) z1z2 = 34; (0) z1z2 = 3 + 4ϊ; (a) zize = 18 — 2i. 
415 —. 
4:17 (a) z1/z2 = (1 + 52; (Ὁ) zi/zz = 3; (ὦ z1/ze = 2]. 


4:23 Roots arez=2+4+ 31,2 =2—-—3i,z=1,z= —1; 
P(z) = (22 — 42 + 13) — De +1). 
4-25 (a) Ζι +22 =14+ δὶ, Ζι -- Ζε Ξε 3 +i; (b) 21 +22 Ξε 7 -- ἰ, Ζι -- Ζ = --1 +i: 
(c) 21+ 22 ΞΞ 3, Ζι —z2 = --3 - δὲ; (4) Ζι + 22 = —2, Ζι — Ζ = —4i. 

4:33 (a) |z| ΞΞ 5, argz = 2217 radians; (Ὁ) |z{| = 5, arg z = 4-067 radians; (c) 
| z| = 3/2, arg z = 3n/4; (d) | z| =4, argz = 5-761 radians. 


4.37 (a) | z1z2| = 3/2, arg z1zx = π|2: | z1/z2 | = 6, arg (z1/z2) = —n/6; 
(Ὁ) | z1z2 Ϊ = 8, arg Z1z2 = —7/12: | Ζι [29 | =2, arg (21 /z2) = —7n/12; 
(c) | z1z2 | = 2. arg Z1Z2 = --7π|4; | zi/ze | = 1/18, arg (z1/z2) = 5π|4. 
4:39 sin 70 = 7sin θ — 56 sin? 6 + 112 sin5 0 ~ 64 sin? @ 
cos 76 = 64 cos? 6 — 112 cos5 @ + 56 cos? 8 — 7 cos θ. 


4-41 220 = —219(4/3 + i), 


_ 2Κπ οτος 2ζπ 
4-43 We = cos "τ + isin —-; CS 0.13 ¢3.6. 
"1 + 6k 1 + 6k 
4°45 wy = 21/4 cos ( τ Ἵ π + sin ( τ | k = 0, 1, 2, 3. 


4.47 The same table except for the heading of the last column in which R — H must be 
changed to L — H. 

4.49 (a) | OP | = νό, 1 = 2/6, m= —-1/V6, n= —1/V6: 01, = 0:625 radians, 
θ5 = 1-992 radians, 03 = 1-992 radians; 
(b) | OP | = 275, /= 2//5, πη τεῦ, n= 1//5: 61 = 0-465 radians, 62 = ἐπ, 
63 = ΤΊ 108 radians; 
(Ὁ [ΟΡ] = νό, 1= -᾿Ἰν 6, m=2//6, n=1//6: θι = 1-992 radians, 
62 = 0:625 radians, @3 = 1.150 radians. 


4:51 (a) 0; = ἔπ, 62 = dn, 03 = ἐπ: (b) 61 = 02 = 03 = 0-956 radians; (c) 61 = 1:231 
radians, 62 = 1.911 radians, 63 = 0.490 radians. 


4.53 0; = 1-231 radians, 02 = 0-842 radians, 03 = 0-842 radians. 
455 (a) OP =i+j+k; (b) OP = —2i + 3j + 7k; (c) OP = 3i —j + 11k; 
(ὦ OP =j. 
457 a= —i + $j] + 2k. 
4:59 (a) ἃ - b = 21] — 4j + 4k, a — b = 4i — 2k; 
(0) a+b =i — 2 +k, a —b= —3i + 6j — 3k; 
(Ὁ) a+b=2i +] — 2k, a —b = —2i + 3] — 4k. 
461 (a) [ΑΒ = V5, ΑΒ τοὶ -- 2; Ο [ΑΒ] =3, AB=-i+2%j+2k; (© 
| AB | = 2/3, AB = —2i — 2j — 2k. 


2 1 3 
= — ----φἨ[ + — k : 
90: ἰὼ : ve ταν ~ Via! 4/14 


* 
3 


Be os 
—11 
4.69 (a) a. b = —11, θ. = arccos fi 
(Ὁ) a. b = 4,0 = arccos fet > 
34/34 
(c) ἃ." = —28,0=7. 
4-73 (a) 8; (Ὁ) 56; (c) 32; (ὦ 0. 
4:75 (a) 1: (Ὁ) —10; (0) 0. 


Τὶ -- 5j + 2k —i — 6) — 2k 
eon MA): @ (“5 


Results are unique apart from sign since if n is a unit normal, then so also is —n. 


4.87 Many different forms of proof are possible. The most elementary involves the
expansion of (a × b) × c in terms of the components of a, b, and c followed by
rearrangement to obtain (a · c)b − (b · c)a.


489 r= (2i+j—k) + λ(--3ι + 3k); 1 = --Ξϑ ν΄ 18, m =0,n = 3{ν 18. 
4.93 p= / 24/11. 
495 χ - γ Ὁ 2- 3. 


=. 


4.99 If the equation of a plane is written in the form r · n = q, where |n| ≠ 1, then q is
proportional, but not equal, to the perpendicular distance of the plane from the
origin. The actual perpendicular distance of the plane from the origin is q/|n|.


4101 (x — 2)? + (ὁ -- 3935 4+ @- 1)? =9. 
4:103 (/6x — V6 — 4)? + (νόν — V6 + 2)? + (νὸς — 21/6 — 2)? = 24. 


] | 4 3 
4:105 Resultant = — (Si + 4) + 3k); 5 units;  ΞΞ —-, m = ——, 1 = ——. 
2 | ) | τ I ἫΝ n 502 


| 4:97 arccos ( 2 
| 


4-111 305/1/26 ft Ibs. 
4113 M = —1li + 7] — 15k. 


Chapter 5 


5.3 |x| is continuous at the origin but its left-hand derivative there is −1 and its
right-hand derivative is +1.


5:5 (a) left-hand derivative is —3, right-hand derivative is 4; 
(b) left-hand derivative is 2, right-hand derivative is 2. 


5-9 α = 1, b = (61/3 — 27)/12. There is a unique tangent at x = 7/6. 
5-11 (a) f(x) = 4x28 — 3sin3x for --ξπ <x <0 and O<x <7; f(x) is non- 
differentiable at x = 0. 
(Ὁ) f(x) = sin 2x + 2x cos 2x + 3x?/3 for all x in [—1, 3]. 
(c) f(x) = — sinx for 0 <x - ἐπ; f(x) =sinx for ἐπ <x <7; f(x) is non- 
differentiable at x = ἐπ. 
5°13 (a) y’ = ἐχ 3} sin x + x1/3 cos x; 
(Ὁ) y’ = (2x + 3Χ1 + cos 2x) — 2(x? + 3x + 1) sin 2x; 
(c) y’ = 6cos 6x cos 2x — 2 sin 6x sin 2x; 
(ὦ y’ = (3x? + 2) cos 3x — 3(x3 + 2x — 1) sin 3x. 


515 (a) γ' = 3(x + 1)(x? + 2x + 1); 
(Ὁ) y’ = bx%(a + bx8)-2/3; 
(c) y’ = 30 cos 2x(2 + 3 sin 2x)4; 
(d) y’ = 6x? cos (1 + 2x); 
(e) γ΄ = 2x cos (1 + x?) cos [sin (1 + x?)]; 
(Ὁ γ' = —2x3(1 + χοῦ 1} sin (1 + x4)¥/2. 


: —6 sin x 
517 @ Y= cos τ 
; 1 x? ; 
(0) y= a2(b2 +. x2)1/2 τε αϑί δ + x2)3/2” 
ithe 2x(1 + 2x?) sec? (1 + x2 + x4) 2x cos(1 + x?) tan ([ + x? + x4), 
ae sin (1 + x?) sin? (1 - x?) ; 
(ὦ y’ = —6cosec? (1 + 3x) cot (1 + 3x); 
(e) γ' = —4/(sin x — 2 cos x); 
, . Gx? -- 2x — 12) 3x — 1 
0 2 ae om (BF) 


5-21 (a) f(x) is discontinuous at | x | = 1, so that the intermediate value theorem will 
not apply to any interval containing this point. In particular, this is true of the 
interval [0, 6]. 

(b) f(x) is continuous in [—I1, —2] so that the theorem applies. f(x) = —0°5 in 
[--11, —2] when x = —3. 

5.25 f(x) = sing 
Hence x = 47 is an absolute maximum of f(x), though there is of course an infinity 
of other points x at which f(x) also attains the value 0.5. 


5-25 Critical points at 1 = (1 + /13)/3 and & = (1 — +/13)/3. The point £1 corres- 
ponds to a minimum and the point 2 to a maximum. 


cos 5 = }sin x 80, setting xo = ἐπ, we have f(xo) = f(x) for all x. 


5:27 Critical points at €, = 0, ξ = 3, and &3 = 3/2. 


5:29 (a) one at x = 0; 
(b) five; 
(c) one at x = 0; 
(d) None. Derivative is discontinuous at x = 0 so theorem does not apply. 
5.321 & = arccos 27. 
5°35 28/2 + 3./2(x — 1) < (1 + x?)8/2 < $3/2 + 6/5(x —1) for 1<x <2. 
5.39 (a) 72/2; (Ὁ) 5; (ὦ 3; (ὦ --ἰ. 
5.41 (a) 0; (Ὁ) 1/5; (Ὁ 0; (ὦ 3; (©) 2|π; (ὃ --ἰ. 
5-45 Minimum at x = 0-4. 
5.47 dV = 3n[aRo?Ho(1 + at)? — Broho(1 + βι)3] 4. 
551 lim f(x) = lim f(%) =5; lim f’(@) = 14, lim f’G@) = 10. 
e—1 — 1+ e—1— 21+ 


5.53 (a) x = 1 is minimum; x = —2 is maximum; x = —4 corresponds to a point of 
inflection with gradient —27/2. 


(b) neither maxima nor minima; point of inflection with zero gradient at x = 0. 
(c) minimum at x =0; maximum at x = 6; minimum at x = 12; points of 


inflection at x1 =6+ /12, x2=6—+/12 with corresponding gradients 
2xi(xi — 12)(2xi -- 12) for i = 1, 2. 


5.57 (a) ἔς = 2x/y, fy = —x?/y?; 
(b) fe = 6xy + (ὁ + y)? + 2x(x + y), fy = 3x? + 2x(x + yd; 
(c) fe = 2x cos (x? + y?), fy = 2y cos (x? + y’); 
(4) fz = cos (1 + x®y?) — 2χϑγϑ sin (1 + x*y?), fy = —2x3y sin (1 + x®y?). 


5.59 (a) fe = 2xyz -- sae fe Ἐν, =x > 
(Ὁ) fe = cos yz — yzsin xz — yz sin xy, | 
fy = —xz sin yz + cos xz — xz sin xy, 
fz = —xy sin yz — xy cos xz + cos xy; 


(c) fe = —(2x + y) sin (x? + xy + yz), 
fu = —@& + 2) sin (x? + xy + yz), 
fe = —y sin (7 + xy + yz). 


—2 —1 —1 
5:63 (a) du = [τ dx + [Ξ 5) ἂν τ [τ «ἘΞ : Ἐ.}} dz; 
(Ὁ) du = sin (y? + 2?) dx + 2xy cos Ὁ + 2?) dy + 2xz cos (y? + 2?) dz; 
(Ὁ du = —3(1 — x2 — γὲ — 22)/2(% dx + y dy + z dz). 


ss 1641 «Ὁ {{ππ πος tae ta | (ES) 1 


va [αξ Ide]; |d4| < 00573. 
(s — ὦ 


ὃ ὃ 
5.67 (8) = = ees 
Χ Ζ éy Ζ 
oz —yz oz —Z 


ox “Ἂν + 2xz cos xz2) ay ΠΟ + 22 cos xz)" 
ὃ: sin x — COS 6z xsiny ~ cosz 
(d) — ΒΞ att δ ες cite AEDES italic 


ax cosx-—ysinz ὃν) cosx —ysinz 
5.69 y? = 6x. 
5.71 lines x = 0 and y = 0. 


t+ (i + 2%)*/2 cos (St? + DI 


du 
5.73 (a) 7 = 2[(1 + £2)1/2 + 4rcos (St? + 1] + 2} G+ pie a 


(b) = 311 + 20 +)? + P(r + 20 + 2¢) + 32°); 
ἄπ 2t 
(c) ay —— 3" 


2 ee 
ii! Beane Se 
dx (cos x — x sin y) 
(oy MY = Caxy? + cos xy) + (2x? τ δὲ οὐ pe 


5.79 $f_r = f_x\sin\theta\cos\phi + f_y\sin\theta\sin\phi + f_z\cos\theta$,
$f_\theta = rf_x\cos\theta\cos\phi + rf_y\cos\theta\sin\phi - rf_z\sin\theta$,
$f_\phi = -rf_x\sin\theta\sin\phi + rf_y\sin\theta\cos\phi$.
When $f(x, y, z) = x^2 + 2xy + yz + z^2$ use the results:
$f_x = 2(x + y)$, $f_y = 2x + z$, and $f_z = y + 2z$.
5.81 Use transformation $x = r\cos\theta$, $y = r\sin\theta$, $z = z'$ and make identifications $x_1 = x$, $x_2 = y$, $x_3 = z$ and $\alpha_1 = r$, $\alpha_2 = \theta$, $\alpha_3 = z'$. It then follows directly by substitution that $\partial(x, y, z)/\partial(r, \theta, z') = r$.


5.83 (a) $\partial(x, y)/\partial(u, v) = -13$ everywhere. (b) $\partial(x, y)/\partial(u, v) = 8uv$ which vanishes if u = 0 or v = 0. (c) $\partial(x, y)/\partial(u, v) = -(2u + 2v)$, which vanishes if u + v = 0.
cu —(u+v) ὃ. dv — 3u 

ax uw pe 2 oy 7 2(u2 + v2) 

ὃ uu ὃυ 2u - 3v 

ὃχ uz + vw? dy ~ 2(u2 + v2) 


5°85 


5.89 (a) Yes; (b) No; (c) No; (d) Yes.


x ax? ; 
5:91 (a) f(x) = 2x arcsec ( + ix 1G? — at)’ 
ἬΝ 2x +1 yes 2x(x? + x + 1) ; 
OT arcsin (x2 ~ 2)  [arcsin (x? — 2)}?/[1 — @? — 2)5}᾿ 
ΜᾺ -- 4x?) — 2 
Hy) = 8 ia: δὲ i, 
(c) f(x) = = (1 + x + arccos 2x) ( Va — 4x8) 
593 Att=}e, Var, Wu -o,
dx 


5.95 Gradient is infinite at the origin so that both envelopes passing through that point 
are tangent to the y-axis. 


5.97 $f_{xx}(1, 1) = 384$, $f_{xy}(1, 1) = 192$, $f_{yy}(1, 1) = 384$. $f_{xy} = f_{yx}$ everywhere because of Theorem 5.24.


5.101 $\partial^3 z/\partial x\,\partial y^2 = 8x^3 - 4x^3\sin x^2y - 2x^5y\cos x^2y$;
$\partial^3 z/\partial x^2\,\partial y = 24x^2y + 2\cos x^2y - 10x^2y\sin x^2y - 4x^4y^2\cos x^2y$.


Chapter 6 
6.3 Find integer R > 4 so that $2^{R-1}/(R - 1)! < 0.01$. Inequality valid for R > 8 so at least 8 terms required. Then,
$e^2 - 1 - 2 - 2^2/2! - 2^3/3! - \cdots - 2^7/7! < 0.01$.
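The final inequality in 6.3 can be confirmed by direct summation. The sketch below assumes the statement is that the series for $e^2$, truncated after the term $2^7/7!$, differs from $e^2$ by less than 0.01; the helper name `truncation_error` is hypothetical.

```python
import math


def truncation_error(n_terms: int, x: float = 2.0) -> float:
    """Return e^x minus the sum of the first n_terms terms x^k / k!, k = 0 .. n_terms - 1."""
    partial = sum(x ** k / math.factorial(k) for k in range(n_terms))
    return math.exp(x) - partial


# Eight terms (k = 0 .. 7) meet the 0.01 tolerance; seven terms do not.
eight_terms_ok = truncation_error(8) < 0.01
seven_terms_ok = truncation_error(7) < 0.01
```

This only evaluates the actual error; the remainder estimate used in the answer is a separate, analytical bound.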
65 (a) #; (Ὁ) t; © ἐ. 
6.7 (a) $2e^{2x}/\sqrt{1 - e^{4x}}$; (b) $(e^x + xe^x + 1)/2\sqrt{xe^x + x}$;
(c) $(1 + x)e^x\cos(xe^x + 2)$; (d) $2e^x/(e^x + 1)^2$.
6.9 (a) $6(1 + 6x^2)e^{3x^2}$; (b) $4e^{2x}\cos(1 + e^{2x}) - 4e^{4x}\sin(1 + e^{2x})$;
(c) $(\cos^2 x - \sin x)e^{\sin x}$.


af y of ι΄ 
611 — = -- -- eM" να cog γίχ, -Ξ = -- εϑἷῃ YX ρος y/ x. 
Ἢ ΞΞ γὶχ ae e ylx 


d 
6-13 τ = (7x — 2γ)ε τ +7y, 


615 (4) Ὁ; (Ὁ) tlog2; (c) log3; (ὦ) log 3/2; (Ὁ 1. 
6:17 (a) 7/2; (Ὁ) 3; (ὦ --ἰ. 


6.19 (a) $(1 + \log x)x^x$;
(b) $(\log \sin 2x + 2x\cot 2x)(\sin 2x)^x$;
(c) $\left(\cos x\log x + \dfrac{\sin x}{x}\right)x^{\sin x}$;
(d) $(\cot x\log 10)10^{\log \sin x}$.


621 dz = ( (x dx + 2y dy). 


6.23 $\partial u/\partial x = yz(xy)^{z-1}$, $\partial u/\partial y = xz(xy)^{z-1}$, $\partial u/\partial z = (xy)^z\log xy$.
6.31 (a) 2 cosh x(cosh 2x cosh x + sinh 2x sinh x); 

(b) 3 sinh 3x exp(1 + cosh 3x); 

(c) sech x cosech x; 

(4) —2x/{(x? + ὃν — @? + 47h, <a <4; 
(e) 2 cos 2x sinh (sin 2x). 
6:35 (a) ν2εἰ ἐπ: (Ὁ) V/2e~*7/4;, (Ὁ 16e—7/38; (4 Be7™/2; (ὦ 4/13 ~with


a = arctan 3/2. 
6.37 $\cos^2\theta\sin^3\theta = \tfrac{1}{8}\sin\theta + \tfrac{1}{16}\sin 3\theta - \tfrac{1}{16}\sin 5\theta$.


6:39 z = (2n + 1)ἐπ. 


Chapter 7 
1p = ¥ _ = 
7-1 SPa = A(b - a) [ ao es Sp, ms Ab us a) [ EY (n 1)(b | 
7 2n 2n 
' one A 
lim Sp, = limSp, = 5 — a?) 
7.3 Sp, = (ε΄α — οὐδ) [ = za) where A = (ὁ — 4)". 


Since, by L’Hospital’s rule, lim {Δ|( — e*4)} = —1/A, 
A—0 


b 

1 

we have [ εἷς dx = lim Sp, = 5 (εὖ — εὖ), 
a n> © λ 


7.9 1(6-13 — e-4), 

7:13 1 = 7. 
© dx 

715 Required area is a triangle + | ae 4+1 = 3/2. 
1 


717 ὃ = Vf is unique in first case. In second case & is not unique, for € = 42/3. 


7.19 Possible examples are:

f(x) = 1 for 1 ≤ x ≤ 2, f(x) = 2 for 2 < x ≤ 4; this then requires a ξ in [1, 4] for which f(ξ) = 5/3. Such a number ξ does not exist.

f(x) = 2 for 1 ≤ x < 2, f(x) = 3 for 2 ≤ x ≤ 3, f(x) = 4 for 3 < x ≤ 4; this then requires a ξ in [1, 4] for which f(ξ) = 3. This is true for any ξ in the interval 2 ≤ ξ ≤ 3.


ar’ 
7214474 re 
1- αἢ 


7:23 2ae—+2") cos a(l + a®) -- ε΄ cos a? — a Ι e-* sin ax dx. 
a 


$(b) 
729 S = 2" | φ)ν ἃ + [ΦὉ)}") dy. 
φία) 


#(b) 
731 Ν τι π { [Φ(}}} dy. 


φία) 
Chapter 8
81 (a) —3 arceoth 5 +C; (ὃ) —kcos3x+C; (c) 3 arctanh 5 + C for χ <9: 


} arceoth = + C for χ >9; ὦ arctan 5 ἘῸ; (Ὁ dysin4x + C; (ἢ 3#/log 3 
+C. 


3 
8-7 (a) = —~3cosx+x+C; (6) 4flog4 +sin2x+C; (© 4coshx — cos x 
1 
ες, (ὦ a + 3x + C. 


2 
8:9 3 arccos (=) for yt. 


8-11 arctan ν (φοβῇ χ -- 1) + C. 
8:13 33x? + 1)8§+C. 


8-15 -Svi — x? + αὶ arcsinx + C. 


8:17 4/(x? — 1) — arccos (1/x) + Ὁ. 
8-19 8/105. 
8.21 log 121/25. 


i 
8-23 ea (: = 3) ΓΕΡῸ 


az 
8:25 (πη x cosh x — cos x sinh x) + C. 
8.27 $x\log^2 x - 2x\log x + 2x + C$.


a2 
8.31 — — 2. 
4 


3 


+ C. 


x—3 


8.37 x + log a 


8 27 30 


5 
se αν oe eh et τυτ 
ie 5) Δα. * 343 8 


X= 
8.39 ----- C, 
Ἐπξ} 


ΣΕ δ 
χ--ἴ 


τὰ τ τς 
χ-ῖ 5 


2x — I Ζ 


x — 
bx a pe + 9 8 


2 
sa lac 


8-43 
x+1 


2 + tan x/2 


i 
BAS 95} — tan af 
tan x/2 — 5
: ] Ὡς ; 
i tan x/2 — 3 | μων 


(0 aden Ὁ τ- 
δ aidan 5 + C, 


x 


1+ νὰ — x) ate 


8-51 log 


8-53 arc sin ΕΞΞῚ + C; x > ν2. 


57 eH. 
8-57 πεν ταὶ — x) + 


8.59 4sinx + 4sin3x + C. 


861 sin x ΕἸ sin 5x 1 sin 7x " 
2 20 28 + 


8.63 1/1 -- }: «1; divergent if A 1. 
8:65 7. 
8:67 2. 


Chapter 9 
9.3 (a) 9; (b) 0; (c) 15.
9.5 (a) Yes (7 × 9); (b) No; (c) Yes (1 × 1); (d) Yes (3 × 4).


9.9 The result asserts that a rotation θ followed by a rotation φ is equivalent to a single rotation θ + φ.
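The statement in 9.9 can be illustrated with 2 × 2 rotation matrices. The sketch below assumes the counter-clockwise convention [[cos, −sin], [sin, cos]], which may differ in sign from the matrix used in the exercise; the composition property holds under either convention. All function names are illustrative.

```python
import math


def rotation(angle: float) -> list:
    """2 x 2 matrix for a counter-clockwise rotation through `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def matmul(a: list, b: list) -> list:
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a: list, b: list, tol: float = 1e-12) -> bool:
    """Entrywise comparison of two 2 x 2 matrices within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))


theta, phi = 0.3, 1.1
# Rotation theta applied first, then rotation phi, so phi's matrix multiplies on the left.
composed = matmul(rotation(phi), rotation(theta))
single = rotation(theta + phi)
```

Because plane rotations commute, reversing the order of the two factors gives the same matrix.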


9.11 (a) Equal if a = 1, b = 2, c = 4; (b) Cannot be made equal because no solution to the two equations a = 1, $a^2 = 4$; (c) Equal if a = 1, b = 3, c = −1.
4 
3 3 0 "Ὁ 
9:15 AB=|7 5 5 O], CD = sail 
3. 4. 2 2 


9.17 BAX = BK ⇒ IX = BK ⇒ X = BK so $x_1 = 15/8$, $x_2 = -3/8$, $x_3 = 1/4$.


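9.19 $A^2 = \begin{bmatrix} \cosh 2x & \sinh 2x \\ \sinh 2x & \cosh 2x \end{bmatrix}$, $A^3 = \begin{bmatrix} \cosh 3x & \sinh 3x \\ \sinh 3x & \cosh 3x \end{bmatrix}$ and $A^n = \begin{bmatrix} \cosh nx & \sinh nx \\ \sinh nx & \cosh nx \end{bmatrix}$.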


9.25 (a) −1; (b) 3; (c) 0.
9.27 (a) Row 1 − 10 Row 3, |A| = −31; (b) Remove factor 3 from Row 1 and a factor 2 from Row 2, |A| = −18; (c) Column 3 = Column 1 + 3 Column 2, |A| = 0.

9-31 Mir = —}, Mio = ὃ, Mis = —%, Mai = —%, Mee =}, Moa = ἢ, Mai = %, 
Mse = ἃ, Ms3 = 4; Ait = —}, Ato = —%, Ais = —%, Aoi = 8, 422 = 4, 


Aes = —~—%, 481 = §, Ase = --ἶἴ, 4,33 = }. 
9.37 (a) linearly independent; (b) linearly independent; (c) linearly dependent because Row 4 = Row 1 + Row 2 + 2 Row 3.


941 (a) [ 6 1 —-5] ()[-7 6 -1] ΟΓ ad -Ί. 
—2 -5 4 | 2. 0-1 ΙΞ 


ς a 
—3 3 —1 1 -- 1 
1 2 3 
9.43 Α“: = 2 5 7 |. 
—2 —4 -—§ 
dA | 62 sec?¢ —sint | 
45 — = - Ἴ : 
9.45 F ᾿ e \ | for ie < ἐπ 


9.49 x1 = 3, x2 = 2, x3 = 2. 
9.51 $x_1 = 1$, $x_2 = 2$, $x_3 = 1$, $x_4 = -1$.

9.53 Rank 2: $x_1 = (22 + x_3)/19$, $x_2 = (5 + 8x_3)/19$.

9.55 Trivial solution X = 0. Nontrivial solution if α = 2 or α = −1 (repeated root).
If α = −1, $X = \begin{bmatrix} k_1 \\ k_2 \\ -k_1 - k_2 \end{bmatrix}$ for arbitrary $k_1$, $k_2$;
if α = 2, $X = \begin{bmatrix} k \\ k \\ k \end{bmatrix}$ for arbitrary k.


9.57 (a) unique solution; (b) inconsistent; (c) consistent; (d) infinity of solutions.
1 I 
9-59 A, = 2,42 = —-1: Xi =p ac Xo =p ‘ for arbitrary scalar jz. 


9.63 $A' = A^{-1}$. In geometrical terms the result asserts that lengths remain unaltered by this transformation.

9.65 Parametric equations of curve are $X = -3x$, $Y = 3(x^2 + 2x + 1)$. Reflection about y-axis and a 3 times magnification.

9.67 $\begin{bmatrix} du \\ dv \end{bmatrix} = \begin{bmatrix} 3x^2\cosh(x^3 + y^3) & 3y^2\cosh(x^3 + y^3) \\ 3x^2\sinh(x^3 - y^3) & -3y^2\sinh(x^3 - y^3) \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$.
No inverse when x = 0 or y = 0, or when x = y, for then the Jacobian vanishes.
Chapter 10


10-1 (a) not convergent; (Ὁ) y = 3 + i, nota member of {zn}; (c) y = 1 + δὶ, y nota 
member of {zn}; (d) γ = e(e + 1), y not a member of {zn}; (6) y = 0, y is a 
member of {zn}. 


10.3 $\lim (w_n + z_n) = 4 + 5i/3$; $\lim (w_n z_n) = 7/3 + 3i$; $\lim (w_n/z_n) = 11/30 + i/10$.


10.5 α = 2 gives rise to a single limit point at 2/3 + i; α = 1 gives rise to two limit points at 2/3 + i and at 2/3 − i. The number α is not unique, since α = −1 would give same result as α = 1 and any even α would give rise to the same result as α = 2.

10.7 (a) semi-circle of radius 1 centred on origin, and in upper half plane.
(b) ellipse centred on origin with semi-axes (a, b).
(c) rectangular hyperbola to right of y-axis and symmetric about x-axis with apex at z = 1.
(d) circle radius 3 centred on z = −2 + i.
(e) circle radius 4 centred on origin.


10-9 Region in annulus in first quadrant between circles of radii 2 and 3 centred on origin. 
Points on radial lines to be included and points on annular boundary to be excluded. 


10.11 The curve is that part of a circle of radius $1/\sqrt{2}$ drawn about (−5/2, 1/2) as centre that lies above the x-axis (i.e., drawn with (−2, 0), (−3, 0) as a chord which subtends an angle of π/4 at circumference). The region is semi-circular and is interior to this curve and to the line y = 1/2 which is a diameter. Boundary points are to be included.


10-13 Οἱ is curve arg (z — 1) — argz = ἐπ; Οἱ is curve arg z — arg (Ζ — 1) ΞΞ ζπ. 


10.15 (a) all z; (b) for z ≠ 2; (c) all z; (d) all z because sinh z = sinh (x + iy) = sinh x cos y + i cosh x sin y.
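The decomposition of sinh z quoted in 10.15 (d) can be compared with a library evaluation of the complex hyperbolic sine. This is a minimal sketch; `sinh_split` is a hypothetical helper name and the sample points are arbitrary.

```python
import cmath
import math


def sinh_split(x: float, y: float) -> complex:
    """Real and imaginary parts of sinh(x + iy): sinh x cos y + i cosh x sin y."""
    return complex(math.sinh(x) * math.cos(y), math.cosh(x) * math.sin(y))


def max_discrepancy(points) -> float:
    """Largest absolute difference between cmath.sinh and the split formula."""
    return max(abs(cmath.sinh(complex(x, y)) - sinh_split(x, y)) for x, y in points)


sample_points = [(0.5, 1.2), (-1.0, 2.0), (2.0, -0.7), (0.0, 0.0)]
```

Both parts are built from functions that are continuous for all real x and y, which is the reason given in the answer for continuity at every z.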


1017 (a) 7+ 41; (Ὁ) (2 + 41)/5; ( 54 — 4). 

10.21 (a) 12; (b) −7 − i; (c) i − 3.

10.33 (a) Yes; (b) Yes; (c) No; (d) Yes; (e) No; (f) No.

10-35 (a) a=2,b=1; (Ὁ) αΞῷ.Ἡ 3,0 ΞΕ]. 

10.37 (a) Yes; (b) No; (c) Yes; (d) No; (e) Yes.

10-39 (a) f(z) = 323 —i; (Ὁ) f(z) = sinhz+zcoshz; (0) (2) = ae”. 


d 
10-41 μ = 2x(1 — y) + const; Σ = 2(iz - 1); ν = 2z(iz + 2) + const. 


d 
10-51 Speed |g | = (uo? + v0?)/?; streamline uv = μοῦο; equipotential u®? — v? = 
uo? --- υοῦ. 
Chapter 11 


11-1 (a) constant pitch helix with elliptic cross-section; (b) variable pitch helix with 
elliptic cross-section lying entirely above (x, y)-plane; (c) curve with parabolic 
projection on (x, y)-plane and cubic projection on (y, z)-plane. 


11-3 (a) continuous; (b) discontinuous because of k component; (c) continuous. 
115 (a) τ = --2απ sin 2nti + 2δπ cos 2π|) + k,


d?u 


a —4an* cos 2nti -- 4bn? sin 2ntj; 


du 
Ὁ — =i42z0 12k, 
(Ὁ) τ i+ 27 + 3 


eo 
11-7 TO) =i, Τῷ) = ὙΠ (i + 2| + 3k). 


ἴεν - A τος ΠΟ ον 
gs (ΞΕ) ἘΞ Ye k), N= -j, = i, = —4. 


me ee ath ee 
i+2j+3k ,_3-3j+k ὼς τῆι! - 8 + 9k 


4/14 J/19 ” 4/266 


11:13 T= 


t4 
11:19 (a) ὁ sinh 2ri + log ej + 7k + Ὁ; 


(Ὁ) [ὦ — 2?) cost + 2¢sin fli + e'j + dogs ~ Dk + C. 
2 
11-21 (=) = —{)2r2 + const. 


11:25 29/28. 
11.27 (a) 8; (b) 8; (c) 8.


τος ucos 6 
11:29 transverse velocity = ——_———_; 
cos 6 — sin θ 
—u? cos 9 
ae® (cos θ — sin 8)? 
2u?(cos 6 — sin 6) + 2u? sin 6 cos θ 
ae*(cos 6 — sin 6)? 

11-31 Jj, = —-1, Je = --ῶ2, Jag = —3. 
11.33 (a) Yes; (b) Yes; (c) No.
11:35 (a) #(6x + y?)i + §(2xy + z)j — ἐγκ; 

(b) §xyzi + §(x?z — sin y)j — 3x?yk; 


() Zn. ὦ 2 
cd -—-—i - -—— 
3x%yz 3xyz 


radial acceleration = 


transverse acceleration = 


| 


— k, 
3xyz? 


11:39 ὁ = $(x? + xyz?) + const. 


I 
11-41 sae PY ma! aa x+4y—2z=8. 
Chapter 12
2n + | Ι.3,5.. (2a 1 
τὰ 5 {}ΞΞ > (0) Sl 
n 3H a5! 1.4.7...@Qn —2) 
(d) dams: = 1/2"41, aon = 1/3"; (0) an = 1/n( + 2). 


12:3 A712 ind sl ae : 
(a) 47/12, remain ἐγ τ 4 +4 5} 3 


b) 17/12 ἘΠΕ fol eee ol 
(b) /12, remainder 5 11:5 21}: 


12:7 (a) divergent; (8) convergent Re < 1/log 6. 


4/2.5.8.11.1 
12-9 (a) convergent, Ra < 3 ( 7 


(c) divergent; (4) convergent, Ra -- 539/5!; (ς) divergent. 


12-11 convergent, 0 < | Rio | < 1/227; (b) divergent; (c) convergent, 0 < | Rio| < 


1 /1\12 
πᾳ) 


12-15 (a) -e<x<e; (Ὁ) -4<x<4; () -w<x< 0; (4) -- ὦ -χ -ο; 


(0) -l6<x <2. 


|] 1 1 1 
12:17 —=+<x¥< 53 aa ee See Oe ——<x<-—: 
(a) 5 x 5 (b) x (c) 3 x V3 
12-21 n athe. Seco eee eee radius of convergence 
MOM Le aia Veg et 5 ge ea : δ 


r = 1, interval of convergence −1 < x < 1, divergent at x = ±1.


C3 | Cs Cc? 


12:23 C ~ —- 
3.3} 5.51 7.7! 


+: τ“ convergent for all C. 


Ὁ (—1)8-122-1y0 


12:2 —————__———- , —O <x< 0} 
Pe) eS (7 — 1)! ei 
bes 1 Qn +..9n-1 
stay (ZEN) ~o <x < 0; 
n=1 Nn. 
Ix? 1.3 χῦϑ: 1.3.57 
CO) Sr a ee ee δος ~l<x<l. 


23 2.45 2.4.67 


16 24 24 
1227 549% —D +57 — DP + & ξ P+ — 1 


3 3 
12:31 (a) = cosh(a + ἐ), α <<a +x; (b) τ τὶ coat ἔ)α «-ξ «α Ἐκ. 


1/2 
12:33 | et — Ps(x)| < 4)" = forO <x <}. 


12.35 0.7721.
12.37 3.14.


ay ae 
Se tb R 
1.5.9.13. s); PPA SNe (; va) 


——— 


12:39 f(x, y) = y + xy + χὴν — y3)/3!. 
12-41 472/e. 

12.43 $3(\log 2)^2/2$.

12.45 1/2.

12.49 4.5826.

12.51 (−1, 0), (1, 2), 1.400.

12.53 1.895.
12.55 min at (0, 0), saddle points at (5, 6), (−5, 6), (5, −6), (−5, −6).


12.57 (0, 0) saddle point; (1, 1) minimum.
12.61 minimum at (18/13, 12/13).
12.63 Y = 0.01 + 1.01x.


Chapter 13 


13.1 (a) order 3, degree 1;
(c) order 2, degree 2;
(e) order 3, degree 1.


13:3 mdv/dt = —mg — λυϑ, 


d 
13-5 (a) x + 2y 2 =0; (Ὁ) yx + y') — y =0; 


dx 
(c) 2xy’ —3y —x = 0; (4) (1 + x)y’ [ + og ( 
(e) $y'' - 3y' + 2y = 0$; (f) $y'' - 4y' + 4y = 0$.


13.6 (b) No; (d) Yes; (f) Yes.

13.7 (b) Yes; (d) No; (f) Yes.

13.9 2.4354, −3 per cent.

13.11 1.1404, −3 per cent.

13.13 Node.

13.15 (a) No; (b) Yes; (c) Yes; (d) Yes.
13.17 $xy' - y = 0$.


Chapter 14 
14-1 (a) cosec? y = sec? x + C; 
(c) x? + y? = log | Cx? |; 
(6) tany = C(l — e”)8; x =0. 


14-2 (a) y = x/{log|x| + 8;
(Ὁ (x — Ο — y? = (3, 


i 


143 (Ὁ) xy + sinxy + x27 =C. 
14-4 (a) w= x2; 3x3y2 + x4y2 + x8y =C; (ὦ No; (e) No. 


145 (a) y = Cx + 2χ3; 
_ 3 + i)4 


(6) y = x arccoth (ic 


(c) y 5 + C(x + 1)?; 
1+x 
Ὁ.) Ξ cos x’ 


I 
(g) y = 1 +. Cez 7/2 
147 y = (1 + dx + 1 sin 2x)/cos x. 
1 
149 y= : fi + e*(x — 1). 
14-11 y = cosh x. 


3 2/3 
14:13 (b) y = ; (Cix + co| : 


ian ᾿ 


(4) x = 1 + log 5 


x2 


14-14 (a) y= [1 +Ce 2)}-; 
1 2 11 a sige 
() y= ; (= -2++3) + Ce | 


14:17 (Ὁ) y= Cx + = envelope y? = 4x. 
14:18 (4) x = Ce? +(p—1), y=Cerl+p)+tp?—-1. 


2 3 
14:19 (a) YOR) =x + cosx; yO) = τ +sinx +cosx; yO(x)=1+ = + sin x; 


- | 
( YOR) = 5 +x, yO) = 5 +sinx, yO) =3 + { cos (sin t) de. 
0 


14.21 If α < 0, line z = αy must intersect z = sin y between π and 2π for positive y. If α > 0, line z = αy must be tangent to z = sin y at point in 2π < y < 5π/2.
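The tangency case in 14.21 can be located numerically. The sketch below reads the answer as placing the point of contact in 2π < y < 5π/2, and uses the fact that a line z = αy through the origin touches z = sin y where α = cos y and αy = sin y, that is where sin y − y cos y = 0. The bisection helper and all names are illustrative assumptions.

```python
import math


def tangency_residual(y: float) -> float:
    """Zero exactly where the line through the origin is tangent to sin y."""
    return math.sin(y) - y * math.cos(y)


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


y_tangent = bisect(tangency_residual, 2.0 * math.pi, 2.5 * math.pi)
alpha = math.cos(y_tangent)  # slope of the tangent line z = alpha * y
```

The residual is negative at 2π and positive at 5π/2, so a root exists in the stated interval and the resulting slope is positive.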


14-23 (Ὁ) w(x) = 20.474 ©9:228% — 4-484 x ~ 20-108 
2(x) = 2-922 e0:607% — 1-647 x — 2-714. 
(d) w(x) = 2 − cos x, z(x) = 3 − 2 cos x.


25 
14-24 (Ὁ) w(x) = 0°:236e7 — 0-743, z(x) = 0°-266e7 — ΞΞ 
Chapter 15


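15.1 (a) $P(\lambda) = \lambda^2 + 5\lambda - 14$; $y = C_1e^{2x} + C_2e^{-7x}$;

(c) $P(\lambda) = \lambda^2 + 4\lambda + 3$; $y = C_1e^{-3x} + C_2e^{-x}$;

(e) $P(\lambda) = \lambda^3 + 7\lambda^2 + 12\lambda$; $y = C_1 + C_2e^{-3x} + C_3e^{-4x}$;

$y = 2 + e^{-3x} + 3e^{-4x}$;

(g) $P(\lambda) = \lambda^4 - 2\lambda^3 - 3\lambda^2 + 4\lambda + 4$; $y = C_1e^{-x} + C_2xe^{-x} + C_3e^{2x} + C_4xe^{2x}$.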
15.3 The general solution is $y = C_1e^x + C_2e^{2x} + C_3e^{3x}$.

The Wronskian $W = 2e^{6x}$ is non-vanishing for finite x proving that the functions $e^x$, $e^{2x}$ and $e^{3x}$ are linearly independent for finite x.
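The Wronskian quoted in 15.3 can be evaluated directly from its definition as a 3 × 3 determinant whose rows are the functions $e^x$, $e^{2x}$, $e^{3x}$ and their first and second derivatives. The sketch below is illustrative only; `det3` and `wronskian` are hypothetical helper names.

```python
import math


def det3(m: list) -> float:
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def wronskian(x: float) -> float:
    """Wronskian of e^x, e^(2x), e^(3x) at x.

    The k-th derivative of e^(r x) is r^k e^(r x), so row k holds r^k e^(r x)
    for r = 1, 2, 3.
    """
    rates = (1, 2, 3)
    rows = [[(r ** order) * math.exp(r * x) for r in rates] for order in range(3)]
    return det3(rows)
```

At x = 0 the matrix reduces to the Vandermonde matrix of 1, 2, 3, whose determinant is 2, which fixes the constant in front of $e^{6x}$.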
15:5 (a) y = (ἰοῦ + Cocos x + C3 sin x; 

(c) y= (Ci + Cox + Cax2)es; 

(ce) y = (Ci cos (2x) + Ce sin /(2x))e* + (C3 cos ν (2Χ) + Ca sin ν (2χ))εξ. 
15:6 (Ὁ) y = (Ci + Cox + Cax2)e7 — 6; 

(d) y = e-22(Ci cos x + Ce sin x) + οἵ sin 2x; 

(ἢ y = Ci cos 2x + C2 sin 2x + 4x sin 2x; 


1 4 
(h) y = e-%(Ci cos 2x + Ce sin 2x) + Ξ et + a 622. 
᾿ Ι 1 
()) y = Cre-?? + Cre3% -- ae 4 30 (cos x — 7sin x). 


15-7 (a) y = Cie + Cre? + dx(x — Ie"; 
| (c) y = οἰ cos x + C2 sin x — 2x cos x). 
15-8 (a) x = Cie’ + Cre*; y = Cyet — Cre. 
(c) x = (Ci + Cot)e? + St -- 9; y = (C2 — 2C1 — 2Corje — 6. + 14. 
15:9 General solution: x = Cye"t + Coe%#; y = —(C1 + Cale + Cre; 
z = 36 Ὁ + Cre?*. 
Particular solution: x = }3(e-* + 2e%4); γ = d(e-* + 2¢2¢); 
Ζ = He? — e-4), 


| 23 24 26 
15-10 (a) yaala- Fats ἔς. " | + 3(x? — ἢ); 


I 
(c) y $x ee 


ΩΣ 
Q 
is 


4 
15:19 X(t) = : sin (ἐ + ἀπ) + sin (32 + ἐπ). 
a a 


4 4 


15.21 This problem arises from the case in which three identical pendulums swinging from a common line of support are coupled one to another by identical springs between their masses.

$\omega_1^2 = a/m$, $\omega_2^2 = (a + b)/m$, $\omega_3^2 = (a + 3b)/m$;
$X^{(1)} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\sin(\omega_1 t + \varepsilon_1)$, $X^{(2)} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}\sin(\omega_2 t + \varepsilon_2)$,
$X^{(3)} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}\sin(\omega_3 t + \varepsilon_3)$.
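The three normal modes in 15.21 can be checked against a simple model of the system. The sketch below assumes, as an illustration not taken from the exercise, that the equations of motion are $m\ddot{X} = -KX$ with the stiffness matrix K built from a restoring constant a on each pendulum and a coupling constant b for each spring; a mode with amplitude vector v and frequency ω then satisfies $Kv = m\omega^2 v$.

```python
def stiffness(a: float, b: float) -> list:
    """Assumed stiffness matrix for three pendulums joined in a line by two springs."""
    return [[a + b, -b, 0.0],
            [-b, a + 2.0 * b, -b],
            [0.0, -b, a + b]]


def apply(matrix: list, vec: list) -> list:
    """Matrix-vector product for a 3 x 3 matrix."""
    return [sum(row[j] * vec[j] for j in range(3)) for row in matrix]


def is_mode(a: float, b: float, vec: list, m_omega_sq: float, tol: float = 1e-12) -> bool:
    """True when K vec equals (m * omega^2) vec to within tol."""
    kv = apply(stiffness(a, b), vec)
    return all(abs(kv[i] - m_omega_sq * vec[i]) <= tol for i in range(3))


a, b = 2.0, 0.5  # arbitrary positive sample values
modes = [([1.0, 1.0, 1.0], a),
         ([1.0, 0.0, -1.0], a + b),
         ([1.0, -2.0, 1.0], a + 3.0 * b)]
```

Under this assumed matrix the amplitude vectors (1, 1, 1), (1, 0, −1) and (1, −2, 1) pair with $m\omega^2$ equal to a, a + b and a + 3b respectively, matching the frequencies listed in the answer.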


15:24 (Ὁ) y = #(e* — sin x — cos x); 
(d) y = $[e* — e-*(cos 2x + sin 2x)]; . 


ὦ y= (3-ax43at) ert eas. 


15:25 (Ὁ) y = —xe7?%, z= (1 + x)e-*. 


Chapter 16 


sin nx 


161 f(x) =2 2c Int} -—— 


163 f(x) = --π Ὁ Dik (1 in Fe cSt el cos ms 


4& 2 
16°5 (ise Se Σ cos 2nx 
WT 


mT na 4n? — | 


ΞΟ E rae > (—IN = =a 


16.9 $f(x) = \tfrac{1}{2} - \tfrac{1}{2}\cos 2x$.


, COS (2n — De 


1 ᾧ a7” 
2n—1 ὙΠ; ieee 


1611 f(x) =- ee 


ΓΞ: Σ sin sin (2n — 1)λ — 1)x 


a oe ae 
πὸ o | 
16°13 — = -. 
9 2 n* 
πϑ το 5] ΜΧ. ΟΟ5 Ηχ 
Ἂ τοὶ τος 4 — | n+1 —- — ——— ], 
1617 f(x) => +4 2 (-) fe - ) 


COS NX 
π᾿ 
2 * 


16°19 f(x) =F +4 S (—1) 
nm=1 


| 137? oo cosmx 2nsin nx 
16°21 f(x) = > 4+ 4 >. (—1)" bea ad | 
nm=1 


H 


το 1 cos (2n — 1)rx 


3 1 
BG 23 λα 2 13 2 (2n — 1)? 3 


sin 2nx 
2 asin 20x
16-25 f(x) == ὩΣ. rae 
τ᾿ COS NX 
16°27 f(x) = -Ξ ΣΠ -- - 1)" 67] aL 
n=1 


16°29 Integrate f(t) = 21 + 1 over [0, 1] to obtain 


μ. ΟΝ UX HX 


a2 oO 
gOS ee πῆς 21 


The series for g(x) is not a Fourier series because of the presence of 
the constant term in the Fourier series for f(x). 


2 sinh ἀπ sinh « aX 
16°33 f(x) = i “ἢ + > (-- z=] with f(+«) = cosh az. 
g ω ae (2n + 1)πχ 
16°35 f(x) = -- ἜΣ (—1)" ΝΙΝ 
π2 a, Se ος 
(2n + 1)? 
16°37 The Fourier coefficients for f(x) are: 
a don = 0, a a = etd ὙΠ 
= 4; Gon πο ν, 42-1 So. ππτππτπτρΠ ---: n= : 
0 2 2n-1 ἜΣ On eae Ὠ OF ail n 


The Fourier coefficients for g(x) are: 


2 


2π } 4 
Ao’ = —, an = (—1)" — and b, = 0 for all π. 
3 os 


Index 


Absolute value, 28, 127 
maximum, 199 
minimum, 199 
Absolutely convergent series, 535, 550, 590 
Acceleration, 216, 498 
radial, 511 
transverse, 511 
Addition formulae for hyperbolic functions, 
288 
Aggregate, 1 
Air resistance, 598, 646 
Algebraic function, 58 
Algorithm, 33 
Alternating series, 545 
Amplitude, 683 
Analytic functions, 469, 470, 680 
Angle between two vectors, 148 
Angular momentum, 167, 512 
Angular velocity, 166 
Antiderivative, 322, 345, 670 
of vector function, 505 
Anti-parallel vectors, 143 
Approximating polynomial, 567 
sum, 306 
Arbitrary constants in solution of differential equation, 601, 634, 662, 696
Arc length of definite integral, 328 
Archimedean spiral, 343 
Area 
as definite integral, 303, 307, 311, 343 
between two curves, 314 
density distribution, 344 
in polar coordinates, 343 
negative, 311, 314
of surface of revolution, 331 
Argument 
of complex number, 128, 294 
of function, 41 
Array, 378 
Associative property, 24, 37, 389 
Asymptote, 46 
Augmented matrix, 416 
Auxiliary equation, 656 


Base of natural logarithms, 281 
Basis 

of matrix, 382, 404 

of number system, 29 

of unit vectors, 141 


Bernoulli’s equation, 636 
Bessel’s inequality, 713 
Bilinear mapping, 480 
Binary number, 30 
Binary operation, 8, 22 
Binomial distribution, 19 
Binomial theorem, 40 
Binormal, 500 
Boundary, 27, 450 
conditions, 603, 693 
Bounded function, 56 
sequence, 75 
Bounds 
lower bound, 57 
upper bound, 57 
Buckingham’s Pi theorem, 427 


Calculus 
first fundamental theorem, 323 
second fundamental theorem, 323 
Cap, 4
Cauchy polygon, 606 
Cauchy—Riemann equations, 465 
Cauchy—Riemann theorem, 466 
polar form of, 469 
Centre of mass, 165 
Centroid, 165 
Chain rule, 194, 239, 240 
Characteristic 
determinant, 422, 688 
polynomial, 656 
root, 422 
vector, 423 
Circle of curvature, 524 
centre of, 524 
radius of, 524 
Circulation, 513, 527 
Closed interval, 26 
Coefficient matrix, 384, 414 
Cofactor, 400 
Combination, 17 
Commutative, 24 
Comparison test, 536 
Comparison theorem, 647 ff. 
Complement of set, 6 
Complementary function, 631, 657, 662, 683 


Complete primitive, 662


Complex number, 115 
argument, 128, 294 
Complex number, (contd.)—

difference, 118 

equality, 118 

imaginary part, 117 

limit of, 444 

modulus, 127, 294 

product, 119 

principal value, 128 

quotient, 120 

real part, 117 

root, 132 
Complex plane, 123 
Components, harmonic, 702 
Components of vector, 142 
Composition of functions, 59 
Concave function, 57 
Conditional probability, 13 
Conditionally convergent series, 535 
Conformal mapping, 471, 474 
Conjugate 

complex, 120 

harmonic, 469-471 
Constraint, 231, 582 
Continuity 

left-hand, 95 

of scalar function, 94, 100 

of vector function, 493 

piecewise, 315 

right-hand, 95 

uniform, 95 
Contour line, 67 
Convergence 

in the mean, 714, 727 

of Fourier series, 704, 713, 716 

of improper integral, 317 

of sequence, 79 

of series, 532, 535 
Convergence tests 

alternating, 545 

comparison, 536 

integral, 538 

nth root, 542, 589 

ratio, 540, 589 
Convex function, 57 
Coordinate system 

curvilinear, 42, 47 

left-handed, 136 

polar, 128, 243, 248 

right-handed, 136 

spherical polar, 267, 429 
Coordinate transformation, 424, 429 
Corrector formula, 621 
Countable set, 22 
Counter-example, 192 
Coupled oscillation, 686 
Cramer’s rule, 414 
Critical points, 201, 474 


Critically damped, 685 

Cup, 5 

Curl, 529 

Curvature, 500, 524 
radius of, 500, 524 

Curve, continuous, 449 

Curvilinear integral, 507 

Cut plane, 476 

Cyclic permutation, 136 


Damping, 683 
critical, 685 
normal, 684 
Definite integral, 307, 310 
of vector functions, 506 
Degree 
of differential equation, 596 
of freedom, 231 
of polynomial, 57 
Del, 516 
De Moivre’s theorem, 131 
De Morgan’s theorem, 7 
Density distribution 
area, 344 
linear, 343
Dependent variable, 41 
Derivative, 182, 495 
chain rule, 194, 240 
change of variable, 245 
directional, 516
higher, 216, 253 
left-hand, 184 
mean value theorem for, 206, 261 
mixed, 255 
of complex function, 458, 460, 466 
of composite function, 192 
of product, 190 
of quotient, 196 
of sum, 189 
of trigonometric functions, 197 
partial, 223 
right-hand, 184 
total, 230 
Determinant, 153, 396, 400 
characteristic, 422 
co-factor, 400 
Laplace expansion theorem, 402 
minor, 400 
properties of, 399 
transpose of, 437 
Difference equation, 33, 35, 171 
Difference quotient, 178, 321 
Differentiable function, 181 
Differential, 213 
as linear transformation, 428 
geometry, 503 
total, 229, 630 


Differential equation 
algebraically homogeneous, 628 
arbitrary constants, 601, 634, 659
Bernoulli, Daniel, 636 
Clairaut, Alexis Claude, 638 
comparison theorems, 647 ff. 
complementary function, 635, 657, 662 
degree, 596 
exact, 630, 631 
factorization theorems, 667 
higher order, 597, 656 
homogeneous, 601, 656 
inhomogeneous, 601, 662 
initial conditions, 601 
integrating factor for, 632 
linear first order, 601, 634 
linear independence of solutions, 658 ff. 
linear superposition of solutions, 657 
matrix, 687 
near homogeneous, 629 
order, 596 
ordinary, 596 
partial, 598 
particular integral, 635, 670, 671, 683 
series solution of, 678 ff. 
simultaneous, 677, 687 
solution, 597, 604 
undetermined coefficients, 663 ff. 
variables separable, 599, 626 ff. 

Differentiation 
of Fourier series, 730
implicit, 232
of integral with respect to a parameter, 326
of vector function, 495, 499 
operator, 218, 665 ff. 
parametric, 252 
Dilatation, 475 
Dimensional analysis, 180, 426 
Direction cosines, 137 
Direction ratios, 138 
Directional derivatives, 516 
Discontinuity, 101 
in Fourier series, 702, 709, 716 
of vector function, 494 
Disjoint sets, 4 
Distance function, 101 
Distribution, binomial, 19 
Distributive property, 24, 151, 386, 392 
Divergence 
of improper integral, 317 
of series, 78, 532 
operator, 529 
Domain, 41, 59 
Dot product, 147 
Double limit, 254 
Dummy variable, 209, 313 




e, 86 
Echelon form, 416 
Eigensolution, 690 
Eigenvalue, 422, 690 
Eigenvector, 423 
Electromotive force (e.m.f.), 514 
Elementary row operation, 416 
Elliptic integral, 591 
Envelope, 61, 235, 639 
Error term 
in integration formula using derivatives, 593
in Simpson’s rule, 574, 594 
in trapezoidal rule, 544 
Euler’s method, 607 
modified, 618 
Even function, 56, 69, 707, 722 
Event, 9 
independent, 14, 15 
random, 9 
simple, 9 
Experiment, 9 
Exponential function, 89, 109, 271, 293 
Exponential theorem, 273 
Extremum, 199, 573 
constrained, 581 
identification of, 211, 221, 573 
of function of two variables, 578 


Factorization theorem for differential equations, 667
Factor of polynomial, 121
Family of curves, 47, 62, 234 
Fibonacci sequence, 35 
Field, 23 
lines, 520 
scalar, 515 
vector, 515 
Flow irrotational, 482, 513, 529 
Force as vector, 163 
Forcing function, 683 
Fourier coefficients, 701, 711, 714, 725 
Fourier components, 702 
Fourier series, 702, 710 
behaviour at discontinuity, 702, 709, 716, 730
change of interval, 719 
change of origin, 718 
complex, 725 
convergence, 704, 713, 716 
cosine, 722 
differentiation, 730 


half range, 723
integration, 727
mth partial sum, 705, 712, 716 
sine, 722, 723 
Freedom, degrees of, 231 
Frequency of occurrence, 9
natural, 684, 689 
Frobenius’s series solution, 680 
Function, 41 ff. 
absolute maximum, 199 
absolute minimum, 199 
algebraic, 58 
analytic, 469 
argument, 41 
bounded, 56 
from above, 56 
from below, 57 
composite, 59 
derivative of, 192 
concave, 57 
constant, 54 
continuous, 94, 100, 455, 457, 493 
continuous from the left, 94 
continuous from the right, 94 
convex, 57 
critical points of, 474 
derivative of, 182 
differentiable, 181 
discontinuous, 101 
domain of, 41 
even, 56, 707, 722 
exponential, 89, 271, 293 
gradient of scalar, 516 
harmonic, 469 
hyperbolic, 287 
implicit, 250 
inverse, 49, 51 
limit of, 91, 99, 102, 453, 493 
logarithmic, 281, 283 
monotonic, 50 
non-decreasing, 50 
non-increasing, 50 
odd, 56, 707, 722 
of complex variable, 452 
of one variable, 41, 46 
of several variables, 65 
of smaller order, 98 
order of, 82
orthogonal, 701 
range of, 41 
rational, 57, 456 
regular, 469 
polynomial, 57 
step, 54 
transcendental, 58 
trigonometric, 52, 54, 294 
Fundamental interval, 700, 704, 707 
Fundamental theorems of calculus, 323 


Gaussian elimination, 420 
General solution, 610, 662 


Geometric progression (series), 40, 586, 711 


Gradient, 181, 183 
operator, 516, 518 
Graph, 3, 41 
Greatest lower bound (infimum), 57
Grouping of terms in series, 548 


Half range expansion, 723 
Harmonic components, 702 
Harmonic conjugate, 469-471 
Harmonic function, 469 
Harmonic series, 535, 539, 587 
Higher order derivatives 

ordinary, 216 

partial, 253 
Homogeneous 

algebraic function, 628 

differential equation, 601, 656 

system of algebraic equations, 419 ff. 
Hyperbolic derivatives, 288 
Hyperbolic functions, 287 
Hyperbolic identities, 288 
Hyperbolic inverse functions, 291 


Identity 
between matrices, 388 
differentiation operator, 666 
transformation, 425 
Image, 42 
Imaginary number, 116 
Imaginary part, 117 
Implication, 8 
Implicit function theorem, 250 
Independent event, 15 
Indeterminate form, 93, 208, 262, 571 
Index
column, 387 
row, 387 
Indicial equation, 656 
Induction, mathematical, 31 ff. 
Inequalities, 25, 713 
Infimum, 57 
Infinite product, 361 
Infinity, 2 
Inflection, point of, 201, 221, 616 


Inhomogeneous system of algebraic equations, 414, 417

solution by Cramer’s rule, 414 
Initial conditions, 34, 611 
Inner product, 147, 381, 391 
Integral 

as a function of a limit, 320 

curves, 607, 647

curvilinear, 507 

definite, 307 

differentiation under sign of, 326, 341 

equation, 652 


Integral, (contd.)— 
first mean value theorems, 320, 340 
improper, 317, 318 
indefinite, 322, 345 
inequalities, 319 
interpreted as area, 303, 307, 311 
limits of, 307, 313 
line, 507 
particular, 635, 670, 683 
properties of, 312 
Riemann, 310 
test, 538 
Integrand, 307 
Integrating factor, 632 
Integration 
by differentiation under integral sign, 
370 
by partial fractions, 362 
by parts, 356 
by substitution, 350, 353, 368 
numerical, 332-334, 342, 680
of Fourier series, 727 
Intermediate value theorem, 197, 260 
Interpolating polynomial, 336 
Interpolation formula, 110, 336, 342 
Intersection of sets, 4 
Interval, 25, 28 
closed, 26, 42 
fundamental, 700, 704, 707 
infinite, 28, 42 
of convergence, 550 
of integration, 307, 338 
open, 26, 42 
semi-infinite, 28, 42 
semi-open, 26, 42 
Into mapping, 59 
Intrinsic equation, 500 
Inverse differential operator, 671 
Inverse function, 49, 51 
circular, 52, 54
derivative of, 291 
hyperbolic, 288 
trigonometric, 52 
Inverse matrix, 407-410 
Inversion mapping, 478 
Irrational number, 1, 23, 38 
Irrotational flow, 513 
Isocline, 605 
Iterated limit, 254
Iteration, 576 
Picard’s method, 641 


Jacobian, 248, 267, 396, 429 
Jump discontinuity, 94 


Lagrange interpolation formula, 110, 336 
Lagrange multiplier, 583 
Lagrange remainder, 564, 570
Laplace’s equation, 469 
Laplace’s expansion rule, 402, 414 
Laplace’s transformation, 691 ff. 
Latent root, 422 
Latent vector, 423 
Law of the mean, 205 
Least squares, 584 
Least upper bound, 57 
Left-hand derivative, 184 
Left-handed system, 136 
Leibnitz’s theorem, 219, 561 
Legendre functions, 263 
Level curve, 67 
Level surface, 518 
L’Hospital’s rule, 208, 571 
Limit 
complex, 444, 447 
in definite integral, 307, 312, 313 
iterated, 254 
left-hand, 92 
of scalar function, 91, 96, 99 
of sequence, 79 
of vector function, 493 
point, 77, 448 
reversal in integral, 313 
right-hand, 92 
Line integral, 507 
Line element, 509 
Linear algebra, 378 
Linear density distribution, 343 
Linear dependence, 404, 405, 658 
Linear homogeneous differential equation, 
656 ff. 
Linear independence, 404, 405, 658 
Linear inhomogeneous differential equation, 661 ff.
Linear system of equations, 413 
Linear transformation, 384, 424, 428, 474 
Linear superposition of solutions, 657 
Logarithmic decrement, 698 
Logarithmic function, 281, 283, 324 
Lower sum, 303 


Maclaurin series, 558 
convergence of, 561, 565 
Majorizing series, 88 
Mapping, 42, 59, 385 
bilinear, 480 
conformal, 472 
into, 59 
inversion, 478 
linear fractional, 480 
many-one, 43
one-many, 59
one-one, 43
onto, 59 
Mathematical induction, 31
Matrix, 380, 387 
addition, 379, 389 
adjoint, 406 
augmented, 416 
coefficient, 384, 414 
derivative of, 410, 412 
diagonal, 387 
element, 385, 387 
equality, 388 
function, 410 
inverse, 407, 408 
multiplication, 381, 383, 389, 390 
null, 388 
order of, 380, 387 
orthogonal, 424 
rank, 405 
rectangular, 382 
skew-symmetric, 388 
square, 384, 387, 406 
subtraction, 379, 390 
symmetric, 388 
unit, 388 
Maximum, 199 
Mean value theorem for derivatives, 206 
extended form, 261 
generalized, 568, 570 
Mean value theorem for integrals 
first, 320 
second, 340 
Milne’s method, 621 
Minimum, 199 
Minor, 400 
Mixed derivatives, 255 
Modulus 
of complex number, 127, 294 
of vector, 144 
Moment 
of inertia, 343, 344 
of momentum, 167 
of vector, 166 
Momentum 
angular, 512 
linear, 512 
Monotonic function, 50 
Monotonic sequence, 75 
Multiplicity of root, 658, 661 


Nabla, 516 
Natural frequency, 683 
Natural logarithm, 281 

as definite integral, 324 
Neighbourhood, 77, 99, 112, 448 
Newton’s first law of motion, 598 
Newton’s gravitational law, 529 
Newton’s law of cooling, 623 
Newton’s method, 575 


Node, 611 
Non-differentiable function, 181 
Norm, 305 
Normal, 161, 500 
Normal damping, 684 
Normal mode, 691 
Number 
absolute value of, 28 
basis of system, 29 
binary, 30 
complex, 115 
imaginary, 1, 23, 116 
irrational, 1, 38 
rational, 22 
real, 23 
representation of, 29 
natural, 1, 21 
ordering of, 25 
Numerical integration, 332-334, 342, 680


Odd function, 56, 69, 707, 722 
One-one mapping, 43 
One-sided derivative, 184 
One-sided limit, 92 
Onto mapping, 59 
Operator D, 219, 674 ff.
inverse, 671
simple rules for, 672 ff.
Open interval, 26 
Order
of a determinant, 154
of a differential equation, 596
of a function, 82
of a matrix, 380, 387
Ordered pair, 3 
Ordered triple, 3 
Ordering of numbers, 25 
Orthogonal axes, 135 
Orthogonal functions, 701 
Orthogonal matrix, 424 
Orthogonal projection, 137 
Orthogonal trajectories (curves), 617 
Oscillatory solution, 683 ff. 
Over-damped, 685 


Parabola of safety, 265
Parallel vectors, 143
Parallelepiped, 175
Parallelogram rule, 125
Parameter, 47, 61
Parametric differentiation, 252
Parameterization, 61, 62, 329, 492, 602
Parseval’s relation, 713, 727
Partial derivative, 223
Partial differentiation operator, 569
Partial fractions, 363 ff.
Partial sum, 78, 532
Particle trajectories, 520
Particular integral, 663, 670, 671, 683
Partition, 303
refinement of, 304
Parts, integration by, 355 
Pascal triangle, 219 
Path of integration, 507 
Pencil of lines, 47 
Periodic extension, 700, 704, 712, 721 
Permutation, 17
cyclic, 136
Phase angle, 684 
Picard’s iterative method, 641 
Piecewise continuous, 315, 702 
Plane, equation of, 161, 176 
Point
critical, 201
of inflection, 201, 221, 616
stationary, 201
Point transformation, 424 
Polar coordinates, 128, 248, 343 
Polygon
Cauchy, 606
of forces, 163
Polynomial, 57, 456
characteristic, 657
factors of, 121
interpolating, 336
operator, 665
Post multiplication, 391 
Potential
complex, 482
scalar, 519
velocity, 482
Power series, 549
Predictor-corrector method, 619
Predictor formula, 620
Pre-multiplication, 391
Primitive, 322
complete, 662
Principal branch, 52, 54 
Probability, 10, 13
addition rule, 14, 37
conditional, 13
multiplication rule, 13, 37
Product
scalar, 147, 381
triple, 155, 156
vector, 151, 155
Projection, 150 


Quadrature formulae
based on interpolating polynomial, 336
error in, 574, 594
Simpson’s rule, 334, 342 
trapezoidal rule, 333 


Radius of convergence, 550, 551
Radius of curvature, 500, 524 
Random event, 9 
Range, 41, 59 
Rank, 405, 418, 427 
Ratio test, 540, 589 
Rational function, 57, 362, 456 
Rational number, 22
representation of, 29
Real number, 23
axioms, 24
Rearrangement of series, 548
Recurrence relation, 85, 272, 274 
Reduced equations, 656 
Reduction formula, 358 
Refinement of partition, 304 
Reflection in line, 50 
Regions, interior and exterior, 448
Regular function, 469 
Relative maximum, 199 
Relative minimum, 199 
Relative velocity, 165 
Remainder term, 538, 541, 543, 546, 563
Resonance, 684 
Resultant of forces, 163 
Riemann integral, 310 ff. 
Right-handed system, 136 
Rolle’s theorem, 203 
Roots
by Newton’s method, 575
multiplicity, 658, 661
of a complex number, 132
Rotation
mapping, 472
matrix, 386
Row operation, elementary, 416 
Runge-Kutta method, 680 ff. 


Saddle point, 578, 611 
Scalar, 115 
potential, 519 
product, 147, 381 
Separation of variables, 599, 626 
Sequence, 74 
bounded, 75, 108 
complex, 444 
convergence of, 79, 445, 447 
limit of, 79, 102 
limit point, 77, 444 
monotonic, 75 
null, 84 
Series, 73, 531 ff. 
arithmetic, 40 
difference, 534 
divergent, 78, 550 
geometric, 40, 533, 586 
Maclaurin, 558, 561 
majorizing, 88
solution of differential equations, 678 ff.
sum, 534
summation by Fourier series, 717
Taylor, 558
termwise differentiation, 555
termwise integration, 555
trigonometric, 700
Serret-Frenet equations, 504, 525
Set, 1 
complement of, 6 
finite, 2 
infinite, 2 
intersection, 4 
null, 4 
symmetric difference, 36 
union, 5 
Simpson’s rule, 334, 342 
error in, 574, 594
Simultaneous differential equations, 677, 682, 687
Singular point, 610 
Singularity, 318, 469 
Space, sample, 11, 13 
Sphere, 162
Stationary point, 201, 578, 584
Step function, 54 
Straight line, 157 
Stream function, 482 
Streamlines, 520 
Stress tensor, 430 
Subsequence, 77 
Subset, 4
Substitution, integration by, 349, 366
Supremum, 57 
Surface of revolution, 331 


Tangent
plane, 518, 528
unit vector, 497
Taylor’s polynomial, 567 
Taylor’s series, 558 
Taylor’s theorem, 563 ff. 
Tensor, stress, 430 
Terminal velocity, 646 
Torque, 167 
Torsion, 503 
Total derivative, 230 
Total differential, 229 
Trajectory, orthogonal, 617 
Transformation, 42
bilinear, 480
dilatation, 425
identity, 425
inverse, 51
Jacobian of, 248, 267
Laplace, 691 ff.
linear, 428, 430, 474
linear fractional, 480
of boundary conditions, 693
pairs, 692
translation, 425
Transcendental function, 58 
Transient behaviour, 684 
Translation, 472, 475 
Transposition operation, 394 
Trapezoidal rule, 333 
error of, 594 
Trial, 9 
Bernoullian, 19 
independent, 18 
Triangle inequality, 127
Trigonometric functions, 54, 294
inverse, 52, 54
principal branch, 54
Trigonometric identities, 295, 301
Trigonometric series, 700
Two point boundary condition, 603


Unbounded function, 57 
Undetermined coefficients
in partial fractions, 365
in solution of differential equations, 663 ff.
Uniform continuity, 95, 554
Union of sets, 5
Uniqueness of solution of differential equation, 274, 602
Unit circle, 6 
Unit vector, 137 
binormal, 500 
normal, 161, 500 
tangent, 497 
triad, 141
Unity, mth roots of, 133 
Upper bound, 57 
Upper sum, 303 


Variable 
change of, 245, 348 
dependent, 41 
dummy, 209, 313, 704, 711 
independent, 41 
Pi, 427 
Variation of parameters, 675 
Vector, 115, 124, 135 ff. 
acceleration, 498, 511 
addition, 142 
angle between, 146, 148 
basis of, 141 
binormal, 500 
bound, 125 
components of, 142
difference of, 142 
direction cosines, 137 
direction ratios, 138 
equality of, 142 
free, 125 
function, 493 
matrix, 382, 387 
modulus of, 144 
moment of, 166
normal, 500
parallel, 143 
position, 142 
scalar product of, 147 
tangent, 497 
triangle, 125 
vector product of, 151, 155, 156
Velocity
angular, 166
potential, 482
radial, 511
relative, 165
transverse, 511
vector, 497
Venn diagram, 5 
Vibration, mechanical, 600, 683, 686 
Volume of revolution, 332 


Wallis infinite product, 361 
Work 
as line integral, 513 
done by force, 165 
Wronskian, 657, 676 


Zero, 58 
location by Newton’s method, 575 
vector, 142 